May 30, 2014

The single best assessment I have seen of the ongoing TrueCrypt debacle is the page put together by Steve Gibson at Gibson Research. Steve has done a great service to the community by summarizing the confusing situation surrounding TrueCrypt, and by providing a trusted archive of the final release of TrueCrypt 7.1a. You can access Steve’s page here:

https://www.grc.com/misc/truecrypt/truecrypt.htm

In addition to Steve’s work, there is also a group in Switzerland that is aiming to take the pieces of the TrueCrypt project forward. It is obviously at a very early stage so we shall have to wait and see how that effort plays out. See here for more info:

http://truecrypt.ch/

In spite of the recent upheaval, the TrueCrypt auditing project appears to be moving forward with Phase 2 of its audit of the TrueCrypt 7.1a code. You can keep tabs on that effort here:

http://istruecryptauditedyet.com/

So, in spite of everything, it seems TrueCrypt may yet have a future. At this point, it’s anyone’s guess how bright that future might be.

My Assessment
As anyone can see from this blog, I have been a big proponent of TrueCrypt over the years. No other encryption software I have seen has worked as smoothly and easily as TrueCrypt to enable the average Windows user to plug their single biggest security hole, and to do so with minimal cost, effort and technical knowledge. Indeed, with TrueCrypt around, there really was no excuse for NOT encrypting your hard drive!

Now all that has changed! The funeral pyre the TrueCrypt developers lit on Wednesday has put a cloud of distrust over TrueCrypt and, to some extent, over the broader world of open source software in general. The job of selling the average user on the value of full disk encryption has always been difficult, but with TrueCrypt it was possible to change some minds because the cost, level of effort, and level of expertise required to implement it were within reach of just about everyone. There really were no excuses. The events of Wednesday, however, vaporized my “no excuses” argument and made it at least 10x more difficult to convince anyone to adopt and use full disk encryption. I can already hear the excuses:

“But to use BitLocker I have to upgrade all my machines and that’s going to cost me money I don’t have right now!”
“I don’t see the point, Microsoft is already in cahoots with NSA anyway.”
“If someone wants my stuff they’re gonna get it no matter what I do!”
“TrueCrypt? My friend told me that it wasn’t too secure.”
“I don’t trust open source. How can you trust something if you’re not paying for it?”

Sadly, even if TrueCrypt does somehow manage to rise like a Phoenix from the ashes, I don’t see these issues going away anytime soon.

What to do?

The bigger problem right now, though, is what advice to dispense to those who drank the FDE / TrueCrypt Kool-Aid.

Steve Gibson is a smart and trustworthy guy. He seems to think that version 7.1a is just fine. He might be right, and I would really like to believe that too! Unfortunately, there’s a nagging voice inside my head that keeps saying that maybe there’s more to this than we know. Maybe it isn’t secure after all.

So, how long should TrueCrypt users hold out? Should they migrate now or wait until we know more? Should they hold off until the TrueCrypt security audit is finished in the fall? Or until one of the new TrueCrypt-based projects takes off? No matter how you cut it, it’s a tough decision. From my vantage point, there is no clear answer right now, and it comes down to an individual judgment call. I’ve written out a few considerations below to help you make your call.

If you are using TrueCrypt to protect your own personal files, and you have no reason to believe that there is a serious adversary (e.g. NSA) after you, then you might be well served by continuing to use TrueCrypt 7.1a until such time as we have more definitive information. I would advise diligence in staying abreast of the latest developments however. You should also have a migration plan worked out so that you’re ready to move if things go even further south.

If you are using TrueCrypt to protect data that belongs to others, and for which you might hold some legal liability, then, in my opinion, you might want to consider switching to something else sooner rather than later. Yes, it may take time and money to make the switch, but in the big scheme of things you may be better off spending that money than having it argued in court that you continued to use a piece of software that was suspect. Here, a vetted solution such as a FIPS 197 or FIPS 140-2 validated hard drive would make a lot of sense! Similarly, using something like Microsoft BitLocker or Apple FileVault will get you away from the cloud of TrueCrypt uncertainty and likely leave you in better shape with the legal system should the worst happen. Please note, I’m not an attorney, so don’t take any of this as legal advice. It is just my best guess as to how to approach the problem.

If you are one of the millions of users still using Windows XP, then it’s time to completely jump ship and fast! Yes, TrueCrypt was the best friend Windows XP users ever had, but there is now a cloud over both systems. I don’t care whether you buy a new PC or Mac, upgrade to Windows 7 or 8, or make a dive into Linux. The time to do something is now! Just make sure to use whatever disk encryption system is available on your new setup, and properly deal with the remaining data on the old PC. In other words, don’t just shuffle it off to the thrift shop or the trash without securely wiping or completely destroying the drives!

That’s all for now. This situation is very fluid so stay on top of it! Thanks for reading!

JR

May 31, 2014 – Full Disclosure

In the interest of full disclosure, I thought that I should mention that I am no longer a TrueCrypt user, and I haven’t been one for several years now. The reason for this is that my primary operating system is Linux. I use Linux-native tools (dm-crypt/LUKS/LVM) to encrypt all my information and swap space. Additionally, I only use Microsoft Windows operating systems within a virtual machine running under Linux. Because the Linux drives are encrypted, there is no need for Windows-native encryption such as TrueCrypt or BitLocker.
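
For readers curious what that looks like in practice, the commands below show how an encrypted Linux setup typically reports itself. The device and mapper names are examples from a stock Ubuntu encrypted install and will differ on your system.

~$ lsblk -o NAME,TYPE,FSTYPE,MOUNTPOINT
# the LUKS container shows up as crypto_LUKS with the LVM volumes stacked inside it
~$ sudo cryptsetup status sda5_crypt
# reports the cipher (e.g. aes-xts-plain64) and key size of the opened container
~$ swapon -s
# confirms that swap lives on an encrypted logical volume rather than a plain partition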

That said, up until 3 days ago, if I had been using Windows on “bare metal” hardware I would most definitely have been using TrueCrypt as well. That statement raises the hypothetical question: “Would I continue to use TrueCrypt today, given what has transpired this week?” The short answer is “No! I would not!” The longer answer would be “No, I would not use TrueCrypt, but then again, I am fully aware of, and comfortable using, an alternative solution (e.g. encrypted Linux with VMs as described above). And I am also confident that this solution meets both my computing needs and my need to keep information secure using open source encryption algorithms.” I sincerely doubt that this would be the case for the majority of TrueCrypt refugees! Indeed, not all computers are powerful enough to run multiple operating systems, and not all programs run well, or at all, in a virtual machine. Furthermore, not all users are knowledgeable enough, or motivated enough, or have the time to learn a second OS and associated virtualization software. In short, just because I would stop using TrueCrypt doesn’t mean that you should too!


Update May 30, 2014

Please see my post here for the latest on the TrueCrypt debacle.


Update 7:41 PM MDT May 28, 2014

The TrueCrypt situation is still incredibly murky, but it seems there is a very real possibility that the TrueCrypt project might really be dead. Even if it’s not dead, at this point there is a huge trust deficit, so, sadly, one must conclude that the days of TrueCrypt are pretty much over.

If you are currently using TrueCrypt, I recommend that you begin working on a migration strategy immediately. I don’t have a slam-dunk alternative to TrueCrypt because none exists. Some options you might consider include hardware-encrypted hard drives, migrating to Linux and using dm-crypt/LUKS, or taking the advice on the TrueCrypt site and using Microsoft BitLocker or Apple FileVault. Your choice will depend on what you do and who you trust. Any of them is likely better than going without disk encryption.


Original Post

Synopsis

As of this afternoon, May 28, 2014, the site www.truecrypt.org and its associated project at SourceForge have changed significantly. It is unclear whether the changes are legitimate or the result of a compromised account.

At the top of the page on the site there is the following warning:

WARNING: Using TrueCrypt is not secure as it may contain unfixed security issues

This page exists only to help migrate existing data encrypted by TrueCrypt.

The development of TrueCrypt was ended in 5/2014 after Microsoft terminated support of Windows XP. Windows 8/7/Vista and later offer integrated support for encrypted disks and virtual disk images. Such integrated support is also available on other platforms (click here for more information). You should migrate any data encrypted by TrueCrypt to encrypted disks or virtual disk images supported on your platform.


The page goes on to give instructions for migrating from TrueCrypt to alternative disk encryption methods such as BitLocker, FileVault, etc. There is a link at the bottom of the page to download version 7.2 of the TrueCrypt code.

The situation is murky at this point, but the most likely explanation is that the site and/or one of the developers has been hacked. I note that the TrueCrypt code passed Phase 1 of a security audit being performed by an independent third party in April. I find it difficult to believe that the developers would suddenly abandon ship and point users to closed-source commercial software.

Our Recommendations

1. Do NOT panic!

2. Wait until we have further information that has been confirmed by multiple sources before taking any action. Meanwhile, do NOT follow the instructions on the truecrypt website, and do NOT download and install any code from the site.

3. Stay tuned! If you use Truecrypt and it turns out there is a problem you’ll want to know about it. Look for more information here or on twitter @securitybeacon as we try to sort out what’s happening!


Last month Canonical released version 14.04 of its well-known Ubuntu Linux distribution. The Trusty Tahr is a Long Term Support (LTS) release and carries 3 to 5 years of upgrades and support depending on which flavor (e.g. Unity, Xubuntu, Lubuntu, Ubuntu GNOME) you choose.

I’ve used Ubuntu almost exclusively in my small businesses since 2007, so I know many of its advantages and limitations. In my experience, new releases of Ubuntu always have some substantial improvements over earlier versions, but they also include a number of bugs and changes that must be addressed to make things run smoothly. As a general rule, the LTS versions of Ubuntu are more reliable and have fewer problems than the regular releases that occur at 6-month intervals. With but a few exceptions, I have tried to stick with the LTS releases so that I can avoid the frequent upgrade cycle and some of the bugs that tend to creep into the regular releases. I’ve lived comfortably with Xubuntu 12.04 (AKA Precise Pangolin) for nearly two years now. In that time it has proven itself very robust and reliable for all my needs. Nonetheless, I was very interested to try the 14.04 release, so last month I upgraded several of my computers to the newer OS. I also performed a fresh install on my laptop computer.

For those considering an upgrade, I have provided some notes that I made while experimenting with Trusty on my systems.

Installer Bug with Manual Disk Partitioning and Encryption Setup

I use full disk encryption on all my systems, and over the years I have developed a very particular way that I like to set up the encrypted LVM. I am sorry to report that there is a bug in the Xubuntu installer (and presumably the other Ubuntu installers) that prevents one from configuring a system from scratch with a manually configured encrypted LVM setup. While the automated encrypted LVM setup works, it may not give you the setup you want. I’m hopeful that the developers will push a fix for this bug soon!
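
For reference, the kind of manual layout I am after looks roughly like the sketch below when built by hand from a live session before running the installer. Device names, volume group names and sizes are placeholders, and this is only an outline, not a complete install walkthrough.

# create the LUKS container on the target partition (placeholder device /dev/sda2)
~$ sudo cryptsetup luksFormat /dev/sda2
~$ sudo cryptsetup luksOpen /dev/sda2 crypt
# build the LVM stack inside the opened container
~$ sudo pvcreate /dev/mapper/crypt
~$ sudo vgcreate vg0 /dev/mapper/crypt
~$ sudo lvcreate -L 30G -n root vg0
~$ sudo lvcreate -L 8G -n swap vg0
~$ sudo lvcreate -l 100%FREE -n home vg0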

Improved support for nVidia Optimus

Many modern notebooks have dual Intel/nVidia graphics. These “Optimus” setups have been famously problematic in Linux. Trusty implements software known as nVidia Prime that seeks to improve performance and cut power consumption, as documented here. While I believe nVidia Prime is definitely a step in the right direction, I did not find it much faster or, more importantly, easier to control than the Bumblebee setup I have used on 12.04. My tests were done with Windows 7 running in Virtual Box and with my principal engineering application, Remcom X-FDTD. Even though nVidia Prime is newer, it does not allow the user to decide on the fly which applications run on the power-hungry nVidia GPU, as the older Bumblebee software does. nVidia Prime also forces a reboot or a restart of the X server anytime you want to switch between the nVidia and Intel GPUs. Even though nVidia Prime is an improvement of sorts, I didn’t find it a compelling one – yet!
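
For those who want to experiment anyway, switching with nVidia Prime is handled by the prime-select utility. As noted above, the change only takes effect after you log out or restart the display manager.

~$ prime-select query
# reports which GPU is currently active (nvidia or intel)
~$ sudo prime-select intel
~$ sudo prime-select nvidia
# log out and back in (or restart lightdm) for the switch to take effect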

Improved support for Android phones

I have a Samsung Rugby Pro smart phone, and it has never played well with Xubuntu 12.04. I know there are people out there who claim that it can work, but my experience is that the hacks and work-arounds are neither smooth nor reliable. With Xubuntu 14.04, however, it was simply plug and play! The phone connected the first time and without a fuss. It actually worked better than the setup I had used with Windows 7 in VMWare!

Faster kernel?

I didn’t do any quantitative tests, but my user experience with Trusty seemed smoother and faster than what I experienced with Precise. It’s not a night-and-day difference, but it’s definitely there. I noticed this on both upgraded and fresh installs, so I believe it is a result of improvements in the kernel, though there are likely other factors involved as well.

VMWare Kernel Module Issue

I attempted to use VMWare Player 5.02 on Trusty but ran into a kernel module issue. This might be resolved by later versions of VMWare. I did not investigate further since I have been using Virtual Box lately. If you’re a VMWare user, you should probably confirm that the issue is resolved before upgrading to Trusty.
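
If you do want to try working around it yourself, the usual first step after a kernel or player update is to rebuild the VMWare kernel modules against the running kernel. I can’t confirm this cures the Trusty problem since I didn’t pursue it further.

~$ sudo apt-get install build-essential linux-headers-$(uname -r)
~$ sudo vmware-modconfig --console --install-all
# rebuilds and loads the vmmon/vmnet modules; watch the output for compile errors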

Virtual Box now supports USB webcams!

I have grown frustrated with VMWare, and last year I spent several days attempting to switch from VMWare to Virtual Box, only to discover that Virtual Box would NOT work with my Logitech Quickcam Pro 9000 USB webcams. This was a deal killer, and I went back to VMWare until I began testing Trusty last month. I’m now pleased to report that the latest versions of Virtual Box have solved the USB webcam issue! I am now able to use ooVoo and Skype with the Logitech USB webcam within a Virtual Box VM. With Trusty you automatically get access to the latest Virtual Box software that solved the webcam problems. If you want to hold off upgrading to Trusty, you can still download and install the latest Virtual Box from Oracle and use it in 12.04.5.
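
For anyone wrestling with the same problem, webcam passthrough is controlled from the VBoxManage command line and requires the Oracle Extension Pack. The VM name below is a placeholder for your own guest.

~$ VBoxManage list webcams
# lists the host webcams available for passthrough (e.g. /dev/video0)
~$ VBoxManage controlvm "Windows7" webcam attach /dev/video0
~$ VBoxManage controlvm "Windows7" webcam detach /dev/video0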

Latest LibreOffice

Trusty has a much newer version of LibreOffice. I have not done extensive testing on this, but I would expect much improved support for the Microsoft .docx formats. This is a serious consideration if you regularly exchange files with Windows users.

Support for TRIM feature on SSDs

Supposedly, TRIM is enabled by default for Intel and Samsung SSDs in the new Trusty kernel. This is likely a very good thing if you’re using SSDs. I did not notice any difference in performance in my non-quantitative testing using Intel SSDs.
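
If you want to check TRIM on your own system, a couple of quick tests are shown below. /dev/sda is a placeholder for your SSD.

~$ sudo hdparm -I /dev/sda | grep -i trim
# the drive should report "Data Set Management TRIM supported"
~$ sudo fstrim -v /
# manually trims the root filesystem and reports how many bytes were discarded
~$ cat /etc/cron.weekly/fstrim
# Trusty ships a weekly cron job that handles the automatic trimming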

Hibernate and suspend issues

Historically, hibernate and suspend functions have been notoriously problematic in Linux. With Xubuntu 12.04, however, I have been able to use both functions reliably on a variety of notebook and desktop computers. With 14.04 it seems that the good days are over, and I was left with systems that had trouble with hibernate or suspend. This was the case whether I performed an upgrade or a fresh install.

Network manager can’t import OpenVPN config file

If you depend on OpenVPN, you will find that Trusty disappoints in a big way. After upgrading, I was not able to import OpenVPN configuration files into network manager. I was forced to configure OpenVPN manually, using my old 12.04 setup as a guide. This bug has been resolved, but the changes are still sitting in trusty-proposed.
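
Until the fix lands, one workaround is to skip the importer entirely: make sure the OpenVPN plugin is installed so you can enter the settings by hand, or run the client directly against your provider’s config file. The file name client.ovpn is a placeholder.

~$ sudo apt-get install network-manager-openvpn-gnome
# provides the OpenVPN entries in the network manager VPN dialog for manual configuration
~$ sudo openvpn --config client.ovpn
# bypasses network manager entirely and runs the connection in the foreground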

Random desktop freezes

One of the most frustrating parts of my experience with the Trusty Tahr was the frequent and random desktop freezes. I noticed this on both my desktop and notebook systems. These freezes were the last straw and led me to backtrack to 12.04. I’m not going to make excuses for Canonical, but I will note that I use nVidia graphics cards and their proprietary drivers on all my machines, so it is conceivable that the X server problems are rooted in nVidia’s drivers rather than in anything Canonical did.

Summary

After spending about two weeks attempting to work through and around some of its issues, I determined that Trusty Tahr wasn’t yet ready to run my critical business systems. I have since reverted to 12.04 until such time as the bugs have been fixed. That doesn’t mean I’m giving up on Ubuntu, though. Canonical has done some fine work over the years, and I’m sure most of these issues will be fixed by the time of the 14.04.1 point release in a few months.

In the meantime, if you’re looking to try Linux as an alternative to Windows XP, I recommend that you avoid 14.04 and go with the older, more reliable 12.04 release instead.

Thanks for reading!

JR

Update May 30, 2014

Please see my post here for the latest on the TrueCrypt debacle.

Update May 28, 2014

The Truecrypt website is in a state of flux. We do NOT recommend using Truecrypt at this time. Please check back often until we have more clarity on the Truecrypt situation.

Original Post

Religious use of encryption is the key to keeping your data secure, whether it is at rest or in motion. One of the biggest risks facing consumers and small businesses is the loss or theft of a computer or hard drive containing confidential or proprietary information. This risk is particularly acute for those responsible for data covered under HIPAA and similar legislation. Most health workers are critically aware of their responsibilities to protect PHI (protected health information), but many may not know that the law provides safe harbor to a “covered entity” only if all PHI is encrypted using a FIPS 140-2 validated algorithm. I emphasize the word validated to differentiate it from the similar-sounding but fundamentally different term FIPS 140-2 compliant. Many encryption programs and systems use algorithms that would be considered compliant with the FIPS 140-2 standard, but only those devices and systems that have been tested and validated under the NIST program can claim full protection under the HIPAA-related laws. It costs a company a considerable amount of time and money to get a product validated, so such devices and systems naturally command a premium in the marketplace. If you are in a regulated industry, however, you have little choice but to pony up for the government-sanctioned encryption.

With the recent Snowden revelations, some of you (myself included) may have doubts as to whether government-validated encryption is actually the same thing as “the best available encryption”. The Snowden documents revealed that the NSA has made attempts to influence NIST and its choice of algorithms and encryption standards. As a result, the FIPS-validated “stamp of approval” no longer carries much weight for those paranoid about NSA snooping. Now, I will freely admit that this website was never intended to help protect anyone from the likes of the NSA and other state-sponsored spy agencies. Their resources are so vast and their expertise so deep that there is almost nothing you can do to keep your information safe from them. However, it is a reasonable goal to at least make life very difficult for just about every other adversary (e.g. hacker or thief) you might face. In that light, FIPS 140-2 validated encryption is still probably the best you can reasonably do at this point in time. There is also nothing but a small speed penalty to stop the really paranoid from adding a second layer of open source encryption on top of a FIPS 140-2 validated storage device. Such a strategy might even work against state-sponsored spies, but we’ll never know for sure! Remember, though, any encryption strategy still hinges on your scrupulous use of long and complicated passwords. You must also keep in mind that even the best disk encryption password is no defense against online attacks once the machine is powered up and the files opened!

The purpose of this article, however, is not to dive into the details of various encryption standards or to teach you how to hide from the NSA. My message is simply that you should be striving to encrypt all your data all the time. How you do that is up to you and your budget – in both time and money. For most people using Windows, a free software-based encryption tool like TrueCrypt is sufficient. Unfortunately, TrueCrypt does not yet play well with Windows 8. If you’ve got a Pro version of Windows 8, you’re probably better off using Microsoft BitLocker instead of TrueCrypt. Apple fans using reasonably recent versions of Mac OS can easily protect themselves by enabling FileVault on their computer, though I’m still not so sure I like the part about uploading the keys to Apple. Linux geeks have excellent native software encryption tools, and it’s just a matter of taking the time to set up and use them. In short, software-based disk encryption is a great option for the majority of people since it’s easy to use and low cost.

There are times, however, when software encryption isn’t enough or isn’t the best solution. For example, maybe you use Windows at work, but occasionally do some work on corporate proprietary data at home on your iMac or Linux box. Here you might consider a hardware encryption solution so you don’t have to worry about software incompatibilities. There are also cases when, for reasons beyond your control, it is difficult or impossible to use software encryption in a particular situation. I ran into such a case a while back when I discovered that Microsoft Windows Small Business Server 2011 backup does NOT provide a means of encrypting backups, and it was impossible to use software encryption tools like TrueCrypt to resolve the problem. Here again, it was a hardware encryption solution that saved the day!

I can’t profess to have used a lot of hardware-encrypted devices, but I can say that I’ve had good success with the ones I’ve used for myself and some of my clients. For my money, some of the easiest to use hardware-encrypted devices are from Apricorn. They have a selection of hardware-encrypted external USB hard drives, SSDs, and flash drives to meet a wide range of needs. I have owned and used one of their FIPS 140-2 validated flash drives for quite a while now and I like it. It’s rugged, and the tiny keypad makes it easy to unlock and use the device on any computer without concerns about software compatibility. If you are in the health field and need a flash drive, these are the ones you want! One of my associates also swears by the IronKey flash drives. I’ve not tried them myself, but they definitely deserve a look!

A Possible Solution to the Evil Maid Problem?

This is an aside on the Apricorn Aegis flash drive. When I first learned of these devices, I had hoped that they might offer a nifty solution to what is known as the “evil maid” problem. If you are not familiar with the issue, the evil maid problem goes something like this. If you use software-based encryption on your computer, you are always left with a small disk slice that is unencrypted. This slice usually holds only a small boot loader that gets you to the point where you enter the password for the disk and unlock it to start the operating system. If you are away from your computer for an extended period, it is conceivable that someone (i.e. an evil maid) might put malware on this boot loader and capture your password without your knowledge the next time you started your machine. The evil maid could return later and have full access to your files. In theory, it is possible to put the boot loader on a USB flash drive and take it with you when you leave. This makes it impossible for the evil maid to load any software, since the remaining disk slices are all encrypted. The trouble is that you usually don’t want the hassle of taking a bundle of USB flash drives with you whenever your machine is unattended. You also could lose them and then be unable to access your machine.
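
For the curious, moving the boot loader to a USB stick is conceptually simple, though the details vary by distribution. The sketch below assumes /dev/sdb1 is a freshly created partition on the stick; it is only an outline, not a tested recipe.

# copy the existing /boot to the USB partition (placeholder device names)
~$ sudo mkfs.ext4 /dev/sdb1
~$ sudo mount /dev/sdb1 /mnt
~$ sudo cp -a /boot/. /mnt/
# install GRUB to the stick and point it at the copied boot files
~$ sudo grub-install --boot-directory=/mnt /dev/sdb
# then update /etc/fstab (by UUID) so /boot mounts from the stick, and wipe the old
# boot loader from the internal drive if you want it truly unbootable without the key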

Enter Apricorn with their Aegis hardware-encrypted USB flash drive. Eureka! Put the boot loader section on one of these and you’re all set. The evil maid is foiled by the hardware encryption, and you’re safe to leave the key locked in a drawer while you go to the beach. Unfortunately, it didn’t work out as I had hoped! It seems the Apricorn flash drive is super sensitive to any change in voltage level on the USB bus, and it automatically locks during any hardware disturbance such as a power-up or reboot cycle. The end result is that no matter how hard I tried, I could not boot a computer from the Apricorn hardware-encrypted flash drive. While this is unfortunate for my plans against the evil maid, it is also probably essential to the design of the device to make sure that it can’t be hacked!

Other Apricorn encrypted devices

I have also used the 1 TB and 4 TB versions of the Apricorn Aegis Padlock series of hardware-encrypted USB 3.0 hard drives. Again, they have been 100% reliable for me over the course of more than 2 years of 24/7 use as encrypted external backup drives on one of my clients’ server machines. I note that the basic Padlock series of drives carries military-grade FIPS PUB 197 validated encryption algorithms. This is a very good standard, but if you are under HIPAA, you will want to choose their Padlock Fortress models to get the required FIPS 140-2 validated encryption. If you’re a regular corporate or home user, the cheaper FIPS PUB 197 validated drives are plenty good enough! In fact, and I probably shouldn’t say this, an Apricorn representative once told me that there is almost no difference in the hardware used in the regular Padlock and the Fortress drives. He went on to say that it cost the company a lot of money and a year of engineering time to push the Fortress drives through the FIPS 140-2 validation. As a result, they must differentiate and charge more for the Fortress models.

Well, that’s enough for now. Again, you’ve got hardware and software encryption options that work, just pick and use them!

If you’re still holding onto Windows XP, you’re not alone! In fact, market share for Windows XP has actually ticked up slightly in the last month! Despite an EOL (End of Life) deadline looming just 34 days away, the 13-year-old operating system (OS) still holds an astounding 29.53% share of desktop computers!

If you’re still running Windows XP and you don’t understand all the fuss surrounding EOL, then I suggest that you go here:  http://www.microsoft.com/en-us/windows/enterprise/endofsupport.aspx and get the story direct from Microsoft. They will explain that your old friend XP is now going away and that you need to tithe them additional money for a new version of Windows in order to continue receiving technical support, bug fixes, and essential security updates to your computer. What they won’t tell you is that upgrading from XP on your existing hardware might not be easy or even possible, and if you want to use the latest Windows 7 or 8.1 you’re most likely going to have to invest in a new PC very soon!

For what it’s worth, I posted a warning here some 18 months ago regarding the XP EOL issue. In the article, I provided several suggestions for paths one might take to migrate away from XP. Most of those recommendations remain valid, but Windows 7 is no longer the most current OS from Microsoft. It remains, however, the one most likely to run on your existing hardware. It is also likely the easiest for someone accustomed to the XP layout to pick up and use without facing a steep learning curve. If you’re a small business owner and you must jump from XP because of compliance issues (e.g. HIPAA), then you’re probably just as well off getting on Windows 7 rather than jumping headlong into Windows 8 and its radically different user interface. Though some of the Windows 8 user interface shock can be cured by installing Classic Shell, it’s still a different beast. Windows 8 has also taken over the retail space, so it’s unlikely that you can rush down to your favorite big box retailer and buy a new machine with Windows 7 while you’re on lunch break. You can still purchase machines with Windows 7, but you must look harder to find them. You also have the option of installing Windows 7 from scratch if you want to go that route. While Windows 8 has not been a barn burner for Microsoft, it does have some compelling new features, so now might be as good a time as any to jump onto that bandwagon if you have the cash and time to work through the issues. Regardless of whether you choose Windows 7 or Windows 8.1, you will need to plan for some extra time to work out the issues of using and integrating it with your existing systems and processes.

If you’re a casual home user who mostly does e-mail and web surfing, and you don’t want to invest in a new PC or OS at this time, then you might consider trying out Linux instead of the Microsoft offerings. Linux is free, and distributions like Ubuntu are remarkably easy to use. Best of all, they will most likely run just fine on your existing hardware. If you’re not very computer savvy yourself, the switch to Linux can be a bit overwhelming, but my experience has been that anyone, from age 8 to 80, can easily use Linux once it’s properly configured for them. If you’re still running XP and don’t know what else to do, maybe now is the time to have your computer geek friend or neighbor over for dinner and see if you can talk them into helping you make the move to Linux.

From a security perspective, I strongly recommend that everyone move away from XP, but there are going to be cases where people are compelled to keep using it. If you’ve got a particular software package that only runs on XP, or one whose license can’t be moved to a new machine, then you’re going to need to find a way to keep using it safely. That means disconnecting it from the network and being super careful to avoid viruses and malware. One option for doing this is to migrate your XP machine to a virtual machine. This will allow you to continue running XP safely from within a newer and supported operating system such as Linux or even Windows 8. I have done this myself to manage legacy software used in my RF/microwave laboratory and it works great! I use Xubuntu as the host operating system and fire up Windows XP in a virtual machine whenever I need to run my test instruments. With the virtual machine approach, the XP installation is safely isolated from the network and easy to restore from backup if something goes wrong. You’ll most likely want at least a dual- or quad-core CPU in your machine to go that route.
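
As a rough illustration, here is how one might carve out an isolated XP guest with VirtualBox from the command line. The VM name, memory size and disk size are placeholders, and the same thing can be done entirely from the GUI.

~$ VBoxManage createvm --name "XP-lab" --ostype WindowsXP --register
~$ VBoxManage modifyvm "XP-lab" --memory 1024 --nic1 none
# --nic1 none leaves the guest with no network adapter, keeping the unsupported OS offline
~$ VBoxManage createhd --filename XP-lab.vdi --size 20480
~$ VBoxManage storagectl "XP-lab" --name IDE --add ide
~$ VBoxManage storageattach "XP-lab" --storagectl IDE --port 0 --device 0 --type hdd --medium XP-lab.vdi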

So, there you have it! You’ve got options and 34 days to do something. Get moving now before it’s too late!

JR

This is Part 2 of my series on upgrading GPU workstations. It is intended for hardware geeks who are interested in building high end GPU workstations for engineering, crypto mining and password cracking. If you’re not on that wavelength you can safely ignore all that follows. If you missed it, please read Part 1 before proceeding.

One step forward and two back

After digesting all the test data from Remcom, I reverted to my initial plan and purchased a new Quadro 6000. I figured this was the cheapest possible upgrade and that it would still give a substantial increase in throughput while I worked out the next steps in the upgrade process. Indeed, after installing the Quadro 6000, I saw a nearly 50% reduction in simulation run time on problems that made good use of memory across all three Fermi cards. There was almost no benefit for very small problems that fit easily within the memory of a single GPU. The latter limitation was expected and is a result of the bandwidth limitation between cards. In short, for small jobs you’re better off running them one at a time on separate GPU cards. There’s no advantage to using all of your GPU cards on a pint-sized problem.

Even though throughput was up 50% after adding the new Quadro, I found that the workstation was no longer stable. I experienced numerous random freezes and X-org crashes that were clearly related to the presence of the Quadro 6000. My first thought was that it must be a video driver issue, so I tried various nVidia driver releases provided by Ubuntu, and ultimately the latest drivers from nVidia, but none of them resolved the issue. I also went through OS upgrades from Xubuntu 12.04 to 13.10 just to be sure that I wasn’t looking at a kernel or x-org issue. The problems persisted. Eventually, I began to think I had a bad Quadro card, since the system was completely stable with the old FX-5800 regardless of the software.  Somewhat late in the process, I ran diagnostics using the nvidia-smi command and learned that the new Quadro was exhibiting ECC RAM errors. At that point, I was convinced it was a hardware problem, so I documented the errors and returned the card to the seller.

I immediately purchased a replacement from another vendor. It arrived the following week, and within 10 minutes of installing it, the Remcom solver was again locked up with GPU ECC errors! I won’t repeat here the words that came out of my mouth at that point, but suffice it to say, I was chapped! I did more tests, and again nvidia-smi confirmed hardware issues! What are the odds of getting two bad Quadro 6000 cards in a row from different vendors?

By this time, I needed to get going with this “easy” upgrade so I ordered yet a third Quadro 6000 and had it shipped over night. As you can imagine, I was crossing a lot more than my fingers when I powered up the system with the third card, but it worked and the system was completely and totally stable. Notably, it was stable with the original Xubuntu 12.04 OS and repository driver.

With the system stable, I queued up a bunch of simulation jobs and then worked to determine what was really going on with the second Quadro card. I installed it in a different system that used the same model Asus motherboard and a 6-core AMD 1055T CPU. After I reset the ECC errors using the nvidia-smi command, I was amazed to see it run flawlessly, even when pushed hard for days at a time. At one point I thought I was losing my mind, so I again tried it in the primary workstation just to double check. Sure enough, within a few minutes it would lock up the system! Something was amiss. Why would one Quadro work and a second, seemingly identical one crash the system?
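
For anyone chasing similar gremlins, the relevant nvidia-smi incantations are shown below. The GPU index is a placeholder for whichever card you are testing.

~$ nvidia-smi -q -d ECC
# dumps the volatile and aggregate ECC error counters for every GPU in the system
~$ sudo nvidia-smi -i 1 -p 0
# clears the volatile ECC error counts on GPU 1 before re-running a test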

A VBIOS issue?

I did some more digging and discovered that the first two cards had VBIOS version 70.00.57.00.02 with build date 04/08/11, while the third card (the one that worked) had VBIOS version 70.00.3C.00.05 with build date 08/03/10. In desperation, I actually attempted to flash the .57 series card with a .3C series VBIOS but the flash program indicated that it was an incompatible VBIOS. I didn’t press further for fear of bricking an expensive card. After further testing of the second card, I became convinced that there was in fact nothing wrong with it. I also decided that the first card (the one I had returned) was also most likely just fine. I contacted the seller and offered to buy it back after admitting my mistake. After it arrived, I retested it, and sure enough, it worked  perfectly in the second workstation!

By this time, I had three Quadro 6000 cards in my hands, so I tried something different. I pulled the Tesla C2070 cards and put in just the three Quadros. Presto! The system was perfectly stable! Wow! What can you say? Apparently, some Quadros don’t play well with some Teslas in the same system, even though they share the same architecture. Go figure. I’ve been unable to find any other references to this situation on the web, so if any of you nVidia gurus have insights into the problem, please enlighten me via the comment section!

A Quad GPU AMD Motherboard

Once I had a stable system, my next step was to try a quad GPU setup. The cheapest path to this configuration was to replace the Asus Sabertooth motherboard with a Gigabyte GA990FXA-UD7. This Gigabyte board is the only AMD motherboard on the market that supports up to four double-width nVidia GPUs. In a quad configuration it provides 8 PCIe lanes to each card, so bandwidth is uniformly good to all the cards. Getting more than that entails an upgrade to an Intel-based server motherboard, a Xeon processor and even more kilobucks of outlay. The motherboard swap was easy, since the AMD FX-8350 CPU and ECC RAM could be re-used in the Gigabyte board. The only downside I noted on the Gigabyte was that it had fewer fan headers than the Asus board. It was otherwise a drop-in replacement with a better PCIe slot layout. Please note that I do NOT do any over-clocking, so my evaluation will likely differ from those looking at it for gaming purposes.

With the system rebuilt around the new motherboard, I powered it up with three, and then fully four GPUs. As before, I noted a stable system when using the Tesla C2070 cards along with the Quadro 6000 having VBIOS 70.00.3C.00.05.  Likewise, everything worked fine when using all of the Quadros together. Crashes were still evident when I mixed the C2070 and newer Quadro cards in the system. To get a working quad GPU system I was forced to buy yet another Tesla C2070! Ouch!

ECC Testing?

As an aside, I note that I deliberately purchased revision 1.x of the Gigabyte board, rather than the newer revision 3.0. I opted for the older version because I had read on various forums that it offered unofficial support for ECC memory. I did not know this at the time, but it is apparently rather difficult to actually confirm that ECC RAM function is working properly on any motherboard. This article suggests several avenues for confirming ECC operation. Using it as a guide, I can confirm that revision 1.x of the GA990FXA-UD7 with firmware update F10 does the following:

  1. The board functions fine with unbuffered ECC RAM whether the ECC function is enabled or not.
  2. The board recognizes the presence of ECC RAM, and the BIOS functions to control it are operational. The Gigabyte options are actually more extensive than those offered by Asus, which provides only a simple enable function.
  3. Memtest86+ revision 4.x indicates that ECC is present, but revision 4.x apparently had issues with proper detection of ECC function, so I’m not sure that is to be trusted.
  4. Ubuntu’s dmidecode is inconclusive: it indicates that ECC detection is present but correction is not, and it reports the memory as 64 bits wide versus the 72 bits reported by the ECC-enabled Asus Sabertooth board.
  5. ecc_check.c is inconclusive.
  6. I have not yet tried to block traces on my working ECC modules to test the function.

In short, I think ECC is working, but I’m not 100% confident in that assertion. If any of you have input or can confirm 100% that ECC works on these boards I’d be interested to hear from you.

Power Conditioning

I use a UPS / battery backup on all my computers. I had been using an old APC Back-UPS RS 1500 with an extended battery pack on my GPU workstation. Rated at about 900 watts, it could handle the triple GPU setup without difficulty. Adding the fourth card and starting a simulation on all of them at once immediately kicked off the overload alarms, so I had to search for something bigger. My research led me to purchase the Tripp-Lite SU1500RTXLCD2U 1500VA 1350W Smart Online rack-mount UPS. This is the largest wattage UPS I could find that uses the standard NEMA 5-15 plugs common in most homes and businesses. Anything larger and the upgrade is likely going to require the services of an electrician. The run time is necessarily short (about four minutes) at full load, but it can be increased, if necessary, by purchasing additional batteries.

I like this unit. It’s very solid and pretty much plug and play. While it’s not exactly silent, the fan is not overly loud, as some of these higher power units apparently can be. If you’re running a quad GPU machine, the UPS fan is likely the least of your auditory worries anyway! I note that this is a “smart online” model, meaning that it constantly generates the output power signal from the battery. This is different from cheaper line-interactive models that switch battery power on and off according to the quality of the incoming power. Even though something like the Tripp-Lite SMART1500RMXL2UA 1500VA 1350W rack-mount UPS would likely have been fine, I opted for the online model so that I’d have the best protection for the expensive GPUs.

Cooling and 4U Rack mount case

Along with the change in motherboard, I also decided to try a new rack mountable case:  The Chenbro RM41300-FS81 is a 4U server chassis designed specifically to hold and cool up to four Tesla GPUs. In fact, it’s the only rack mountable case I have been able to find which is capable of doing this! As far as I know all other cases that support quad double width GPUs are of the full tower form factor such as the Cooler Master HAF-X. While there’s nothing wrong with the tower cases, I have a lot of rack mounted test equipment in my RF/microwave lab, and I’ve come to see the advantages of going that route. There are also options for open air frames with no sides, but I prefer to have equipment shielded and enclosed if possible. For now, I am keeping that as an option only if I find cooling to be a problem.

I’ll admit that I was skeptical that the smaller 4U case would cool adequately, but so far it’s been just fine. Fan speeds on the Tesla cards range from about 50% to about 70% under load, with the middle cards running harder due to restricted airflow and heat from adjacent cards. Temperature on the Fermi cards always seems to hover around 89 C under load. Interestingly, the newer Quadro cards seem to run as much as 20 C cooler than the C2070 cards when not loaded. Fan speeds and temperatures under load still seem to reach the same levels, however. The real test of cooling will come in July when it’s 105 F in the shade and the central air is straining to keep the ambient at 80 F. I’ll keep you posted on how that works out!

As far as noise is concerned, the fans in the Chenbro are definitely louder than the HAF-X. I wouldn’t want a Chenbro in my living room, but it’s OK for a garage lab. Beyond that, I really like the Chenbro. The quality and workmanship are first-rate. My only minor complaints would be the lack of external ports and the blindingly bright blue power LED. The former can be remedied with aftermarket panels in one of the drive bays. The latter I fixed with a snip of black electrical tape over the LED lens. I neglected to take photos when assembling my build, but you can see some good photos here if you’re interested in this exceptional case.

The work on the next part of this article is still in progress, but I’m aiming to set up a pair of multi-GPU workstations and use MPI (Message Passing Interface) to get it all working on a single problem. Until then…

Thanks for reading!

JR

Update on ECC Testing May 19, 2014

I had the Gigabyte GA990FXA-UD7 motherboard offline for several months. I recently put it back in service and circled back to investigate the ECC issue. I found a reference here showing another way to confirm ECC operation using the Linux EDAC subsystem. The command line output shown below seems to confirm that the board does have ECC enabled.

~$ dmesg | grep -i edac
[   17.274327] EDAC MC: Ver: 2.1.0
[   17.275730] AMD64 EDAC driver v3.4.0
[   17.335091] EDAC amd64: DRAM ECC enabled.
[   17.335103] EDAC amd64: F15h detected (node 0).
[   17.335165] EDAC MC: DCT0 chip selects:
[   17.335167] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[   17.335168] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[   17.335170] EDAC amd64: MC: 4:     0MB 5:     0MB
[   17.335172] EDAC amd64: MC: 6:     0MB 7:     0MB
[   17.335173] EDAC MC: DCT1 chip selects:
[   17.335175] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[   17.335177] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[   17.335178] EDAC amd64: MC: 4:     0MB 5:     0MB
[   17.335180] EDAC amd64: MC: 6:     0MB 7:     0MB
[   17.335182] EDAC amd64: using x4 syndromes.
[   17.335184] EDAC amd64: MCT channel count: 2
[   17.335205] EDAC amd64: CS0: Unbuffered DDR3 RAM
[   17.335205] EDAC amd64: CS1: Unbuffered DDR3 RAM
[   17.335205] EDAC amd64: CS2: Unbuffered DDR3 RAM
[   17.335206] EDAC amd64: CS3: Unbuffered DDR3 RAM
[   17.335265] EDAC MC0: Giving out device to 'amd64_edac' 'F15h': DEV 0000:00:18.2
[   17.335443] EDAC PCI0: Giving out device to module 'amd64_edac' controller 'EDAC PCI controller': DEV '0000:00:18.2' (POLLED)

The command edac-util -v shows that there are as yet no ECC errors. That’s good!

~$ edac-util -v
mc0: 0 Uncorrected Errors with no DIMM info
mc0: 0 Corrected Errors with no DIMM info
mc0: csrow0: 0 Uncorrected Errors
mc0: csrow0: ch0: 0 Corrected Errors
mc0: csrow0: ch1: 0 Corrected Errors
mc0: csrow1: 0 Uncorrected Errors
mc0: csrow1: ch0: 0 Corrected Errors
mc0: csrow1: ch1: 0 Corrected Errors
mc0: csrow2: 0 Uncorrected Errors
mc0: csrow2: ch0: 0 Corrected Errors
mc0: csrow2: ch1: 0 Corrected Errors
mc0: csrow3: 0 Uncorrected Errors
mc0: csrow3: ch0: 0 Corrected Errors
mc0: csrow3: ch1: 0 Corrected Errors

At this point, I’m satisfied that ECC is enabled and working on the board. The situation with dmidecode incorrectly reporting memory characteristics is apparently not unusual with AMD boards.


This post is intended for hardware geeks who are interested in building high end GPU workstations for engineering, crypto mining and password cracking. If you’re not on that wavelength you can safely ignore all that follows.

Background

I’ve previously reported on my experience building a GPU workstation that I use in my consulting engineering business for analyzing antennas with a Finite Difference Time Domain (FDTD) code from Remcom. Even though I already have a very capable workstation with several high end nVidia Tesla and Quadro graphics cards, there is always pressure to improve performance and do ever harder problems. In particular, last year I took on a project that involved analyzing and designing antennas that are used inside the human body to kill cancerous tumors with microwave energy. Modeling these antennas has been particularly challenging and simulation run times have soared as a result. I’ve thus been researching hardware options to increase throughput and I thought I’d document them here in hopes they might help others evaluating similar choices for their GPU applications.

My current system uses an 8-core AMD FX-8350 processor and 16 GB of ECC RAM on an Asus Sabertooth 990FX motherboard. The GPUs consist of one nVidia Quadro FX-5800 and two nVidia Tesla C2070 cards. (For those who didn’t read my earlier post, I point out that run time with X-FDTD is almost entirely dictated by the GPU, so there is no advantage to using a faster CPU. The AMD 8-core is more than sufficient in this application and vastly cheaper than an equivalent Intel Xeon alternative. Also, the Remcom software only works with nVidia cards, so any arguments for switching to AMD GPUs are moot.) I have a separate RAID file server, so the GPU workstation has only a single Intel SSD to run the OS and hold simulation programs and data. To date, it has functioned well with the Remcom software and has been remarkably fast and reliable. The only slight downside was that the Quadro and the Tesla cards did not share the same nVidia GPU architecture. This was unavoidable, since I purchased the older Quadro before the Fermi-based cards came on the market. As a result of the architecture difference, and limitations of the Remcom software, I was unable to utilize all three cards concurrently on the same problem. I could, however, launch one simulation on the C2070 cards and a second, smaller simulation on the Quadro, so it wasn’t a big deal until I started the cancer work and faced considerably more and bigger problems.

nVidia Fermi or Kepler Architecture?

Because of the mixed GPU architecture issue, one of the first upgrade options I considered was simply to swap out the FX-5800 and replace it with a newer Fermi-based Quadro 6000 to match the architecture of the C2070s. That looked like an easy and relatively cheap way of getting an immediate 50% speedup on big problems. Additionally, it would allow me to increase the size of the problems I could solve by about 50%, thanks to the added 6 GB of video RAM on the Quadro 6000.

All this seemed like a good idea until I went to the nVidia website and was presented with the very latest GPU technology, known as Kepler and embodied in such wonders as the Quadro K6000 and the Tesla K40. Wow! I could hardly believe my eyes! These cards have an astounding 2880 CUDA cores and 12 GB of RAM! On paper, nVidia’s specs make it appear that a single K6000 could replace four or maybe even five C2070s in terms of compute power, and a pair would positively blow the doors off my existing setup. The investment would be steep, roughly $5k per card, but I figured if I could get a 5x speedup over my existing setup, then an investment of $15,000 or more would be well worth it to advance my engineering work.

Before I plunked down that kind of cash, however, I needed some proof that those claimed improvements would actually translate into a commensurate speedup in the X-FDTD software. I contacted Remcom to see if they had any data for their software running on the Keplers. As luck would have it, last year one of their engineers wrote an internal research paper looking at the scaling of their algorithm on a cluster of various nVidia cards. While they didn’t have data for the latest K40, they did have data for the recent K20X and many others going back to the older Fermi-based C2050/C2070 cards. The paper was chock full of data showing how well their algorithm scaled as the number of GPU cards was increased, both on a single machine and in a cluster. Note that I said number of GPU cards and NOT number of CUDA cores! The most telling plot in the whole paper showed total simulation time for the same typical size problem run on varying numbers of the various types of nVidia cards. While newer cards like the K20 and K20X were clearly faster than older cards, they were definitely not faster in proportion to the number of CUDA cores!

The Remcom benchmarks showed that moving from a C2090 to a K20 only reduced run time by about 5% on an average test problem. Surprisingly, they pointed out that the C2070 was the clear winner from a value perspective because it had the same amount of RAM and was only slightly slower than the C2090, but was much less expensive. Figuring the K40 to be only a wee bit faster than the K20X, I estimated that I might see a ~30% reduction in run time with a K40 versus one of my old C2070s. That sounds good, but it is nowhere near the 5x speedup touted by nVidia! This is definitely a case where spending 5x the money doesn’t get you 5x the performance.

Now, don’t get me wrong, I’m not saying that nVidia is lying when they claim that the K6000 is 5x as fast as a C2070. I’d bet it is every bit that fast on some algorithms. Unfortunately, there must be something about the FDTD algorithm, or at least the way it’s coded by Remcom, that keeps it from making good use of the extra 2432 cores the K6000 has over the C2070. Alas, I had no excuse to burn cash on the latest nVidia wonder cards.

Going back to the Remcom technical paper, it was clear that throughput scales almost linearly as the number of GPUs increases. This scaling was even possible with a GPU cluster through the use of MPI (Message Passing Interface) that Remcom had recently incorporated into their code. It seemed that my best option was to buy as many Fermi cards as I could reasonably afford and then work out the networking, power and cooling issues involved in a small GPU cluster. I’ll dive into some of that in the next part of this series. Until then, I think it’s clear that you have to look deeper than the marketing material offered by nVidia and AMD in order to judge whether their latest GPU marvels will really pay off in your particular application. Test data on your software is all that matters!

Thanks for reading!

JR

Update Mar 7, 2014

Here’s the link to Adventures in GPU Upgrades – Part 2

Update May 30, 2014

Please see my post here for the latest on the TrueCrypt debacle.


Update May 28, 2014

The Truecrypt website is in a state of flux. We do NOT recommend using Truecrypt at this time. Please check back often until we have more clarity on the Truecrypt situation.


Original Post

Here at Security Beacon, we generally recommend that everyone use some form of disk encryption, preferably full disk encryption (FDE), to protect the data residing on their computer hard drives. For users of Microsoft Windows operating systems, we’ve specifically recommended TrueCrypt because it is free, open source, and very easy to use. Indeed, the latest TrueCrypt is so good that there has literally been no excuse for Windows users NOT to encrypt their hard drives! Unfortunately, TrueCrypt has not kept up with the latest changes being forced on the industry by Microsoft. As a result, the software is somewhat out of date and is not 100% functional on new machines that ship with Windows 8.

At this time, the most recent release of TrueCrypt is revision 7.1a, from Feb 7, 2012. The web site indicates that full support for Windows 8 is to be implemented in a future version, but there is as yet no sign of when, or if, such a version will be released. One can only hope that donations to the project will continue so that the developers can keep updating the code for the newer operating system. In the interim, most Windows 8 users should consider other options, as outlined at the bottom of this article.

Note that I said “most” users of Windows 8! There are some sources on the web indicating that the critical TrueCrypt function of encrypting a system partition actually does work under Windows 8, provided that the hard drive is formatted with the old MBR (Master Boot Record) partitioning scheme rather than the newer GPT (GUID Partition Table, where GUID means Globally Unique ID) scheme that is required on new machines shipped with Windows 8 already installed. If you are using Windows 8 on a system that was upgraded from an earlier version of Windows, then your disk may have been formatted with MBR, so it may be possible to successfully use TrueCrypt to encrypt your system partition. You may also be in luck if you’ve installed Windows 8 from scratch and, through astute foresight or dumb luck, formatted the drive as MBR instead of GPT. Regardless of the formatting of your disk, you can of course continue to use TrueCrypt under Windows 8 to safely encrypt files and folders on your hard disk. You just can’t encrypt the system partition, as is preferred in most situations. If any of the discussion on disk partitioning seems foreign to you, wait until a new version of TrueCrypt is released before attempting to encrypt your Windows 8 system partition.

Other options:

It seems the change to GPT has also affected a number of other disk encryption programs besides TrueCrypt. A web search reveals that DiskCryptor is also incompatible with Windows 8. Apparently, even Symantec / PGP had some issues initially, but their newest releases do support Windows 8. The only bright side to the TrueCrypt issue is that, starting with Windows 8.1, hard drives will be encrypted by default provided you’ve got the right hardware. See also this excellent Ars Technica article for more details. The downside to the default encryption method is that you must share the encryption keys with Microsoft. I’m sure that’s a non-starter for many readers after last year’s Snowden revelations! If you don’t have the right hardware, you might consider an upgrade to Windows 8 Pro or Enterprise to get the BitLocker feature as an alternative. The TrueCrypt situation is unfortunate and, for some people, it is likely reason enough to delay, or avoid entirely, an upgrade to Windows 8.