Monday, December 22, 2014

The system has reached the maximum size allowed for the system part of the registry

The system has reached the maximum size allowed for the system part of the registry. Additional storage requests will be ignored

If you get this error on a machine it's a major annoyance. There are plenty of potential reasons why this may be happening, but I'm not going to get into those. This article is going to assume that the registry truly has reached its breaking point through normal means. Also, as a disclaimer, messing with your registry without understanding what you're doing is very dangerous, so if you do continue with this process, be careful. This article is my version of MS KB 2498915.

The trick to reducing the size of your registry files is to open them in regedit without your main OS running, and then export them. You can do this with a WinPE boot disk if the device is a physical machine. In my case the server was actually a VMWare ESXi guest, and the WinPE disk wouldn't boot. In that case, what I did was shut down my server and open the virtual hard drive on another ESXi guest to gain access to the files. This created a new problem, but it was fixable. Make sure to check that link beforehand if you'll be doing the same in an ESXi environment.

The files you need access to are in the C:\Windows\system32\config directory: Software, System, Default, etc. These are your registry hive files, and they're the same whether you boot from WinPE or mount the drive on a 2nd computer. Check their sizes; whichever one is the largest is the one I would suggest starting with. In my example I'll assume you're shrinking the Software hive, but you can repeat these steps for any of the registry hive files.

In my case, the software file was 3-4x larger than any of the others so that's where I suspected I had the most free space to reclaim. Once you have access to the registry files through WinPE or a 2nd computer, open regedit.

Once regedit is open, click on HKEY_LOCAL_MACHINE to select it and then go to File->Load Hive... This will open a file dialog box, and here you'll want to navigate to the registry files directory from above and open the hive, which in this case is Software. You will be prompted to give it a temporary name, and you can use whatever name you'd like to refer to it. In my case I used the name toshrink.

After the hive has loaded, select it and go to File->Export..., choose Registry Hive Files as the save type, and save it somewhere. Do not overwrite the existing hive, just in case you need it again. Something like softwarecompressed would be a good name. At this point you'll have to let regedit run, and it could take a while. My software hive was ~180MB and it ran for somewhere between 30-60 minutes before completing. Smaller hives will run faster, of course.

After the export process has completed, close regedit and then reopen it. Once it is open again, click the hive you loaded in HKLM and go to File->Unload Hive... This will unload it from the registry.
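
If you'd rather do the load/export/unload from a command prompt instead of regedit (for example, inside WinPE), reg.exe can do roughly the same thing. This is just a sketch, not the procedure from the KB; the D:\ paths assume the offline Windows installation is mounted as D:, and the output file name is arbitrary:

rem Load the offline Software hive under a temporary name
reg load HKLM\toshrink D:\Windows\System32\config\SOFTWARE
rem Save it back out; reg save writes a fresh copy of the key to a new hive file,
rem which should drop the unused space (I've only done this through regedit myself)
reg save HKLM\toshrink D:\softwarecompressed
rem Unload the temporary hive once the save finishes
reg unload HKLM\toshrink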

Now go to the location of your newly compressed hive file. In our example we have softwarecompressed, which should be smaller than the software hive we started with. Rename software to software_orig, and rename softwarecompressed to software. Place the compressed and newly renamed file into the C:\Windows\system32\config folder and then reboot your machine normally. This should load the system with the compressed hive file, and get rid of your maximum size allowed error.

For me the software hive was 192MB, and after this process it was 133MB. The other hives didn't shrink enough to make it worthwhile, but the compression on the software hive was significant.

Good luck


Wednesday, December 17, 2014

VMWare error: The parent virtual disk has been modified since the child was created

I needed to shrink the registry files on an ESXi guest (guest1), and in order to do so I shut it down and mounted its virtual disk on a separate ESXi guest (guest2). Once that was done, I removed the temporarily shared virtual disk from guest2 and went to boot guest1. I received an error that said "Cannot open disk: .... The parent virtual disk has been modified since the child was created". I then tried reverting to a snapshot and that failed, so for a minute I thought I may have just lost one of my servers. Fortunately this is something that can be fixed, and it wasn't too difficult.

An ESXi guest's virtual hard disk can consist of multiple files, and each file carries ID tags that tell the system how to chain those files together into the complete disk. When I mounted the disk on guest2, one of those ID tags got changed. To fix the problem I needed to sort through the virtual disk files and correct any mismatched ID tags. Here's how I did that.

First, I had to enable SSH access to my ESXi host. You can get in through the ESXi CLI instead if you'd prefer. You can enable SSH access to an ESXi host from the direct console or from the vSphere Client. From the direct console, log in and go to Troubleshooting Options->Enable SSH. To enable it through the vSphere Client, open the host and go to Configuration->Security Profile. Then in the Services section, click Properties, click on SSH, then Options. From there you can set the startup policy and also start the service.

Once I had SSH access enabled, I logged into the host holding guest1. Now you want to get into the datastore where the virtual hard disk files are stored for the guest, and the navigation commands are the same as you'd use in Linux. For me it was cd /vmfs/volumes/datastore1/guest1. Once I was in that directory I could use the ls command to list all the files. What I needed were the guest1.vmdk, guest1-000001.vmdk, guest1-000002.vmdk, etc. files. For my machine, I had guest1.vmdk and then three add-ons, so four total virtual disk files to look at.
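
For reference, here's roughly what that looks like on the ESXi shell. The datastore and guest names are from my setup, so substitute your own, and the grep is just a shortcut for pulling the ID tags out of the small descriptor files instead of opening each one:

cd /vmfs/volumes/datastore1/guest1
# list everything in the guest's directory; the numbered .vmdk descriptors are the small text files
ls -lh
# print the CID/parentCID lines from each descriptor
grep "CID" guest1.vmdk guest1-000001.vmdk guest1-000002.vmdk guest1-000003.vmdk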

At this point you'll want to use your favorite text editor to open these files. Personally, I use vi, but you can use whatever you'd like that is available. Open each of the virtual disk files and note the CID and parentCID values at the top, then close it and move to the next. The parentCID of guest1.vmdk should be something like ffffffff. Once I had opened each of my files I had this:


guest1.vmdk
CID: 32b76102, parentCID: ffffffff

guest1-000001.vmdk
CID: 7d3d984f, parentCID: fa1f4813

guest1-000002.vmdk
CID: 49eb6c66, parentCID: 6e1b350e

guest1-000003.vmdk
CID: fa1f4813, parentCID: 49eb6c66

Now is where you get to solve the puzzle. Which CID or parentCID is incorrect and screwing up your virtual hard disk? I had to draw it out, but what I ended up with is

6e1b350e (???) <- 49eb6c66 (guest1-000002) <- fa1f4813 (guest1-000003) <- 7d3d984f (guest1-000001)

For me, the parentCID value in guest1-000002.vmdk was pointing to an unknown CID. Once I found that out, I opened guest1-000002.vmdk in vi and changed the parentCID value to the CID value of guest1.vmdk, 32b76102. I saved and closed, then booted up guest1 without any other problems.
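
If you'd rather not hand-edit the file in vi, sed on the ESXi shell can make the same change. This is a sketch using my values; note that in the actual descriptor file the values appear as CID=... and parentCID=... lines rather than the way I've paraphrased them above:

# keep a backup of the descriptor before touching it
cp guest1-000002.vmdk guest1-000002.vmdk.bak
# swap the dangling parentCID for the CID of guest1.vmdk
sed 's/^parentCID=6e1b350e/parentCID=32b76102/' guest1-000002.vmdk.bak > guest1-000002.vmdk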

I did try it a second time just to see, and in that case the same thing happened on the same file. It looks like when I mount the virtual disk in guest2, the CID value on the primary vmdk file gets changed. All the others stay the same, so if you can find the parentCID value that is going nowhere and change it, then you're back in business.


Articles I found that helped me come up with the solution:

VMWare KB about hard disks failing to open

VMWare KB about resolving CID mismatch on virtual hard disks

Enabling ESXi Shell or SSH access to ESXi host

Editing VMWare config files

Monday, June 30, 2014

Install Symantec's Backup Exec 2010 Mac OS X agent in OS X 10.9 or 10.10

Updated: 10/23/15

I reinstalled the OS on my OS X server and upgraded to 10.10 Yosemite. In that process I had to reinstall the Backup Exec agent, and I'm still running 2010. It was quick and simple, even though I ran into the same error about Switch.pm missing. Since it had been so long, I looked it up rather than referring back to this post, and found an easier command than the one below on this blog. In Terminal, run sudo cpan -f Switch. You'll likely get a message that says something about your system needing Xcode installed, and you'll have to select Install. Once that finishes, re-run sudo cpan -f Switch and let it autoconfigure itself. Once done, go ahead and run the Backup Exec Agent installer and any service packs you have for it and it should work just fine. Make sure to start the agent when you're finished.
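
In short, the updated fix boils down to this (a sketch of the steps above):

sudo cpan -f Switch   # first run will likely prompt you to install the Xcode command line tools
sudo cpan -f Switch   # run it again once that install finishes and let CPAN configure itself
# then run the Backup Exec agent installer and any service packs, and start the agent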

Original Post:

If you're like me and hated the newer interface Symantec introduced in Backup Exec 2012, you may still be running Backup Exec 2010 instead. Now, if that's the case and you introduce a new Mac running OS X 10.9 Mavericks into your list of machines to back up, you're going to run into a problem. The BE 2010 agent will not install in OS X 10.9 and will instead give an error message about not being able to locate Switch.pm. Before you jump to the conclusion that you need to upgrade Backup Exec to a version compatible with OS X 10.9, keep reading.

I ran across this today with a new OS X 10.9 Mac Mini server. It turns out that in Mavericks the Switch.pm module has been removed from the version of Perl that comes with the OS, and the RAMS agent installer relies on it to run. It also turns out that adding it back to the OS is a pretty simple process, and once done the RAMS installer runs just fine. Here's what I did:

1. Open a Terminal window on the OS X server
2. Run the command cpan install
3. Follow through the on-screen setup for CPAN (Comprehensive Perl Archive Network). I used the defaults
3a. During the install you will be prompted by the OS that you need to install make in order for the installer to continue. Click OK and let make install
4. Now CPAN should be installed so start it with sudo cpan. You need sudo access to install the Switch.pm file
5. You should now be at the cpan prompt. Type install Switch and press Enter. This will install the missing module back into your Perl library
6. Once that's done, type quit and press Enter to get back to the main Terminal prompt (the whole sequence is sketched below)
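
Put together, the Terminal session looks roughly like this (a sketch; the inline comments are mine):

cpan                 # first run walks you through the CPAN setup; the defaults are fine
sudo cpan            # reopen the CPAN shell with admin rights
install Switch       # typed at the cpan prompt; installs the missing module
quit                 # leaves the cpan shell and returns to Terminal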

Now you should be able to run the RAMS installer package normally, and this time it should work.

Before doing this though, you should really ask yourself if you should. It seems to be working fine for me, but the Switch module was deprecated and left out for some reason. I make NO guarantees about this process, other than that it worked for me. If you have a test network available, please make sure to utilize that first before making this change in your production environment. Like I said, this was figured out the same day that I posted it, so I haven't done enough testing to say that it won't adversely affect the system, RAMS performance, or both.

This is the article I originally came across while trying to get the RAMS installer to work, and it's what led me to write this post.

Monday, June 9, 2014

Free up inactive memory in OS X

Let me preface this by stating that normally you should not have to do this, and it is typically best to let the OS manage the RAM usage. I suggest you use this method only when necessary rather than as a standard practice. I'm also assuming that you're familiar with the Terminal app, primarily for the situation where you want to schedule this. If you are not, you may need to do some research on Terminal outside of this post.

If you're like me, you've had an issue on an OS X server where, when you check it out, there is little to no free RAM available but a bunch is stuck in the state known as inactive memory. This typically indicates one of two things: either you need more RAM to run your server applications effectively, or something is causing a memory leak. In my case it was a little of both. However, running out and picking up Apple Server memory on a whim isn't always an option, and tracking down a faulting program isn't always easy or quick either. If you're struggling to keep an OS X server accessible while waiting for an upgrade window or to give yourself time to troubleshoot, or if you're having problems with your personal Mac's memory allocation, this may be a temporary workaround you can use.

There is a command named "purge" that will free up the inactive memory. You can read more about it on the purge man page. You can simply open Terminal and issue the command, then press Enter. You may need to use sudo purge, but nonetheless you can invoke purge and free up some RAM immediately.

However, if this is a server or continues happening, you may want to script this command and call it on a schedule. While waiting for a shipment of RAM for a couple OS X servers having this issue, I used this script provided by Daniel Payne on stackoverflow.com in this article.

#!/bin/bash
# Grab the "Pages free" count from vm_stat (third field, e.g. "Pages free:  12345.")
free=`vm_stat | grep free | awk '{print $3}'`
# Strip the trailing period so we're left with a plain number of pages
freer=${free%%.*}
# Only run purge when free pages drop below the threshold (a page count, not bytes)
if [ "$freer" -lt "18000" ]
then
    nice purge
fi

I opened Terminal and used vi to create the script file, but you should be able to use TextEdit or any text editor you'd like. Just make sure to give it a .sh extension, so your file should be named something like freeRAM.sh. What this script does is run the purge command if there are fewer than 18000 free memory pages available. You will want to modify that value to match the minimum amount of free memory you will accept before running the command; this eliminates unnecessary runs of the purge command. Remember that this value is in memory pages, not B/KB/MB/GB. If you don't know what page size your system is using, you can run vm_stat within Terminal and it will tell you in the first line. For my server I was using the default page size of 4096 bytes. This means that if I wanted to run the purge command when there was less than 200MB of free memory, I would need to substitute the 18000 in the script above with

200 MB * 1024 KB/MB * 1024 B/KB * 1 page/4096 B = 51200 pages (instead of 18000)

Once you have your shell script saved you need to add it to the schedule. You can use cron, but it appears to be deprecated in newer versions of OS X, so instead we'll use launchd (via launchctl). I used the guide found here to create my .plist file and get it into the launch daemons schedule.
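
As a rough example, the whole thing might look like this. It's a sketch, not the exact plist from that guide: the label, the /usr/local/bin/freeRAM.sh path, and the 30-minute interval are all arbitrary choices you'd swap for your own.

# write a minimal launchd job that runs the script every 30 minutes
sudo tee /Library/LaunchDaemons/com.example.freeram.plist > /dev/null <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.freeram</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/freeRAM.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>1800</integer>
</dict>
</plist>
EOF
# make sure the script itself is executable, then load the job
sudo chmod +x /usr/local/bin/freeRAM.sh
sudo launchctl load -w /Library/LaunchDaemons/com.example.freeram.plist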

You can also create your script using Automator and then schedule it using iCal. However, I wanted to run the script multiple times a day, and it appears that using iCal allows once a day as the most frequent option.

Monday, May 19, 2014

OS X Active Directory Users losing admin privileges when offline

For anyone using Directory Services in OS X to bind the Mac to a Windows domain, you've likely seen the option to allow administration by..., where you can define groups to administer the machine. I have a security group setup in Active Directory specifically for this, and whenever I bind the Mac to the domain I add that group and turn that option on. However, once in a while, when a machine is not able to directly authenticate with an Active Directory server, domain users do not have local admin rights. Typically admin rights come back the next time the machine is able to communicate with Active Directory, but in the meantime it is an annoyance while offline. Fortunately, it appears that I'm not the only one who has been dealing with this. I only wish I had spent some time researching it sooner.

Previously, my workaround to this problem has been to remove the Mac from the Active Directory domain, and then rejoin. While this has worked, it is just a workaround rather than a solution. It appears that someone with the same issue has found the actual problem, and also posted the solution. What is apparently happening is that even though those groups are supposed to be allowed to administer the computer according to the setting in Directory Services, the accounts are not added to the local admin group on the Mac. You can fix this by opening a Terminal session, and running the following command:

dseditgroup -n /Local/Default -o edit -u localUsername -p password -a accountToAdd -t user admin

*UPDATE*

Rather than use the above command, I found simply using sudo removes the need for the -u and -p switches so you can use the following.

sudo dseditgroup -n /Local/Default -o edit -a accountToAdd -t user admin

-n = node
-u = local username used to authenticate to make the change
-p = password for user defined with -u
-a = name of account to add to the admin group
-t = type of account you're adding
admin = group name

You'll want to use your own information for -u, -p, and -a. -t can take group as an option (instead of user). I haven't tried that yet, but it should allow you to add an entire security group to the local admin group in case you have multiple users for that one machine (see the sketch below).
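
For example, adding a whole AD group might look like this. I haven't tested it, the group name is just a placeholder for whatever your AD group is called, and the second line is only there as a quick way to confirm an account ended up in the local admin group:

sudo dseditgroup -o edit -n /Local/Default -a "your_AD_admin_group" -t group admin
dseditgroup -o checkmember -m someDomainUser admin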

Now, I believe this may do the same thing as well if you're not comfortable using Terminal to issue that command. You'll need to have login info for an actual local admin account, and the domain account you want to grant admin rights to must have logged in to the machine at least once already. Simulate being offline by turning off the wifi connection and disconnecting any LAN cable(s). Once you're offline, go into System Preferences->Accounts, click the user that should have local admin rights and check the box that says "Allow user to administer this computer". Then reconnect your network connection and reboot.

The two articles I found related to this that I used are:

https://discussions.apple.com/message/16026679#16026679

https://discussions.apple.com/message/22540531#22540531




Wednesday, May 14, 2014

ProcessExplorer "Unable to extract 64-bit image" error

The ProcessExplorer program is a very useful utility. I needed it today to track down a file lock, but upon trying to run it on my Windows 7 64-bit machine, I kept getting an error telling me "Unable to extract 64-bit image...". A few Google searches mentioned this being caused by a permissions error, but that didn't make sense since I'm an admin on the machine. After I ran across this on the SysInternals forum, I realized that those saying permissions were the problem weren't wrong, but their answer wasn't specific enough.

Upon running the ProcessExplorer executable, it extracts the 64-bit version of the program to the AppData\Local\Temp folder and attempts to run it from there. However, if you're like me and have restrictions on applications running from the Temp folder, this will cause the error. To get around it I simply navigated to my temp directory, moved the procexp64.exe file to my Desktop, and executed it from there. It opened right up and I was able to get back to what I needed ProcessExplorer for.

By default the AppData directory is hidden. The quickest way to get there is by clicking Start, then type %tmp% into the Search box and press Enter. Or type the path directly into the navigation bar, or choose to show hidden files.

Full path to the temp folder is C:\Users\"your username"\AppData\Local\Temp
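
If you'd rather do the move from a command prompt, something like this works (a sketch; it assumes the default profile paths and that procexp64.exe has already been extracted to Temp):

copy "%TEMP%\procexp64.exe" "%USERPROFILE%\Desktop\"
"%USERPROFILE%\Desktop\procexp64.exe"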

Thursday, March 20, 2014

Toshiba Mobile LCD and IE 10 or 11 crash

I have one user with a Toshiba Portege ultrabook, and it's been fine. He also wanted a Toshiba mobile LCD screen to bring along for more screen real estate. Eventually his machine started having issues with Internet Explorer, where IE would crash immediately upon opening. I assumed it was a corrupt IE install since IE 10 had just come out, but uninstalling/reinstalling IE 10 didn't help. IE 9 worked fine though. After going through plenty of additional troubleshooting I finally found the cause of the issue: it was the DisplayLink driver that was installed with the mobile LCD screen. When that driver was installed IE would stop working, and as soon as it was uninstalled IE would go back to normal. Luckily DisplayLink had newer drivers available to download, and the updated driver worked fine with both IE 10 and IE 11.

For anyone scratching their head trying to find a solution to IE 10 or 11 crashing, if you have a portable LCD screen in use that uses DisplayLink drivers, try updating those first. It'll save you plenty of head scratching and unnecessary malware scans.

Friday, March 14, 2014

Configure VLAN(s) and enable routing on an HP Procurve switch

If you're running a managed HP Procurve switch and want to take advantage of VLANs to subnet your network, it's pretty easy. Here's a diagram of my example



In this example we have two VLANs (VLAN 1 and VLAN 2). VLAN 2 is setup just for workstations and must connect to VLAN 1 for DHCP, DNS, and Internet access.

In case the image is too small, on the switch, ports 1-24 are being designated as part of VLAN 1, and 25-48 are part of VLAN 2. VLAN 1 is the 192.168.10.0/24 subnet, and VLAN 2 is the 192.168.20.0/24 subnet.

On my primary and secondary DNS/DHCP servers, I have a DHCP scope setup for the primary network (VLAN 1), and another scope setup for VLAN 2. In my DHCP options for both, I set the primary DNS server to 192.168.10.10, and the secondary to 192.168.10.11. For VLAN 1, I set the router to 192.168.10.1, but on VLAN 2 I set the router to 192.168.20.254 since the default gateway needs to be found within the same subnet.

To actually set this up, first you would telnet into your Procurve switch, which I'm hoping you know how to do if you're going to attempt setting up a VLAN. You'll need enable access on the switch as well. Once you've logged into the switch and are at the terminal, here is what I would enter to set up the above example. I've added comments/explanations to all the lines, so be aware that you do not want to type the " - (.....)" portion of each line into the terminal window

enable - (enables admin access)
conf t - (enters configuration mode using the terminal)
ip routing - (enables IP-based routing, which is required to allow the two VLANs to communicate)
vlan 1 - (will enter the configuration mode for vlan 1, which should exist by default on the switch)
untag 1-24 - (untags ports 1-24 on the switch to indicate they're going to be restricted to vlan 1)
ip address 192.168.10.254/24 - (assigns the IP address of 192.168.10.254 to the VLAN 1 interface)
vlan 2 - (will create vlan 2 if it doesn't already exist, then enters configuration mode for it)
untag 25-48 - (untags ports 25-48 on the switch to indicate they're going to be restricted to vlan 2)
ip address 192.168.20.254/24 - (assigns the IP address of 192.168.20.254 to the VLAN 2 interface)
ip helper-address 192.168.10.10 - (sets VLAN 2 to send DHCP packets to the primary DHCP server)
ip helper-address 192.168.10.11 - (sets VLAN 2 to send DHCP packets to the secondary DHCP server)
ip route 0.0.0.0 0.0.0.0 192.168.10.1 - (sets the default route to the default gateway in VLAN 1)
write mem - (commits the changes you made to the configuration stored in memory on the switch)
end - (exits configuration mode)
exit - (exits enable mode)
exit - (logs you off from your telnet session)

The one issue I ran into when I first did this is that I had "ip default-gateway 192.168.10.1" set on my switch and thought that was good enough for VLAN 2 to get to the Internet. However, that setting is only effective when ip routing is disabled, and for the VLANs to communicate ip routing needs to be turned on. That requires you to add an actual static route, or use ip default-network if it's an available option. For more information on that see this link. It's from Cisco, but the same applies to the Procurve devices. That link explains the differences between the default gateway options, and what routing protocols are affected by each.

The one thing I didn't touch on here is setting up your actual routing to be able to reach VLAN 2. For that you'll have to decide what is best, because it depends on your network and the routing devices and protocols in use. In my example, I need a route in VLAN 1 that sends traffic for 192.168.20.0/24 to 192.168.10.254 (the switch's IP on VLAN 1). If you have OSPF configured on your network and your switch participates, then you likely have nothing to do here. For my network, the switch doesn't support OSPF and the router is managed by my ISP, which I have no access to. In order to get traffic to VLAN 2, I added a static route to my firewall for it. That way it still gets advertised over OSPF and VLAN 2 can be reached.
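
For example, on a firewall or router with a Cisco-style CLI, that return route would look roughly like this (syntax varies by vendor; the addresses are the ones from my example):

ip route 192.168.20.0 255.255.255.0 192.168.10.254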

When I initially decided to do this, I used a few articles to come up with the final configuration. In case they may be helpful to you:


Happy VLANing!