Wednesday, December 17, 2014

VMWare error: The parent virtual disk has been modified since the child was created

I needed to shrink the registry files on an ESXi guest (guest1), so I shut it down and mounted its virtual disk on a separate ESXi guest (guest2). Once that was done, I removed the temporarily shared virtual disk from guest2 and went to boot guest1. I received an error that said "Cannot open disk: .... The parent virtual disk has been modified since the child was created". I then tried reverting to a snapshot and that failed too, so for a minute I thought I might have just lost one of my servers. Fortunately this is something that can be fixed, and it wasn't too difficult.

The ESXi guest's virtual hard disk can consist of multiple files. Each file stores ID tags (a CID and a parentCID) that tell the system the order in which to chain those files together into the complete hard disk, and those tags were getting changed when I mounted the disk on guest2. To fix the problem I needed to go through the virtual disk files and correct any mismatched ID tags. Here's how I did that:

First, I had to enable SSH access to my ESXi host. You can also get in through the ESXi CLI if you'd prefer. You can enable SSH access either from the direct console or from the vSphere Client. From the direct console, log in and go to Troubleshooting Options->Enable SSH. To enable it through the vSphere Client, open the host and go to Configuration->Security Profile. Then in the Services section, click Properties, click on SSH, then Options. From there you can set the startup policy and start the service.

Once I had SSH access enabled, I logged into the host holding guest1. Now you want to get into the datastore where the virtual hard disk files are stored for the guest, and the navigation commands are the same as you'd use in Linux. For me it was cd /vmfs/volumes/datastore1/guest1. Once I was in that directory I could use the ls command to list all the files. What I needed were the guest1.vmdk, guest1-000001.vmdk, guest1-000002.vmdk, etc. files. For my machine, I had guest1.vmdk and three add-ons, so four virtual disk files in total to look at.

At this point you'll want to use your favorite text editor to open these files. Personally, I use vi, but you can use whatever is available. Open each of the virtual disk files and note the values at the top for CID and parentCID, then close it and move to the next. The parentCID of guest1.vmdk should be something like ffffffff. Once I had opened each of my files I had this:


guest1.vmdk
CID: 32b76102, parentCID: ffffffff

guest1-000001.vmdk
CID: 7d3d984f, parentCID: fa1f4813

guest1-000002.vmdk
CID: 49eb6c66, parentCID: 6e1b350e

guest1-000003.vmdk
CID: fa1f4813, parentCID: 49eb6c66

Now is where you get to solve the puzzle. Which CID or parentCID is incorrect and screwing up your virtual hard disk? I had to draw it out, but what I ended up with is

6e1b350e (???) <- 49eb6c66 (guest1-000002) <- fa1f4813 (guest1-000003) <- 7d3d984f (guest1-000001)

For me it was that the parentCID value in guest1-000002.vmdk was pointing to an unknown CID. Once I found that out, I opened guest1-000002.vmdk in vi and changed the parentCID value to the CID value of guest1.vmdk, 32b76102. I saved and closed the file, then booted up guest1 without any other problems.

I did try it a second time just to see, and in that case the same thing happened on the same file. It looks like when I mount the virtual disk in guest2, the CID value on the primary vmdk file gets changed. All the others stay the same, so if you can find the parentCID value that is going nowhere and change it, then you're back in business.
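
If you'd rather not draw the puzzle out by hand, the same check can be sketched in a few lines of Python. This is only an illustration, not something I ran on the ESXi host: the function name find_broken_links is made up, and the dictionary is typed in by hand from the CID and parentCID values noted above.

```python
# Hypothetical helper: the values are copied by hand from the descriptor files.
ROOT_PARENT = "ffffffff"  # parentCID that marks the base disk


def find_broken_links(disks):
    """Return (filename, parentCID) pairs whose parentCID matches no known CID."""
    known_cids = {cid for cid, _parent in disks.values()}
    return [
        (name, parent)
        for name, (_cid, parent) in disks.items()
        if parent != ROOT_PARENT and parent not in known_cids
    ]


# filename -> (CID, parentCID)
disks = {
    "guest1.vmdk": ("32b76102", "ffffffff"),
    "guest1-000001.vmdk": ("7d3d984f", "fa1f4813"),
    "guest1-000002.vmdk": ("49eb6c66", "6e1b350e"),
    "guest1-000003.vmdk": ("fa1f4813", "49eb6c66"),
}

print(find_broken_links(disks))
```

With my values it reports guest1-000002.vmdk and its parentCID of 6e1b350e, which is the one that was going nowhere.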


Articles I found that helped me come up with the solution

VMWare KB about hard disks failing to open

VMWare KB about resolving CID mismatch on virtual hard disks

Enabling ESXi Shell or SSH access to ESXi host

Editing VMWare config files

Monday, June 30, 2014

Install Symantec's Backup Exec 2010 Mac OS X agent in OS X 10.9 or 10.10

Updated: 10/23/15

I reinstalled the OS on my OS X server and upgraded to 10.10 Yosemite. In that process I had to reinstall the Backup Exec agent, and I'm still running 2010. It was quick and simple, even though I ran into the same error about Switch.pm missing. Since it's been so long I looked it up rather than referring back to this post, and found an easier command than below on this blog. In Terminal, run sudo cpan -f Switch. You'll likely get a message that says something about your system needing XCode installed, and you'll have to select Install. Once that finishes, re-run sudo cpan -f Switch and let it autoconfigure itself. Once done, go ahead and run the Backup Exec Agent installer and any service packs you have for it and it should work just fine. Make sure to start the agent when you're finished.

Original Post:

If you're like me and hated the newer interface Symantec introduced in Backup Exec 2012, you may still be running Backup Exec 2010 instead. If that's the case and you introduce a new Mac running OS X 10.9 Mavericks into your list of machines to back up, you're going to run into a problem. The BE2010 agent will not install in OS X 10.9 and will instead give an error message about not being able to locate Switch.pm. Before you jump to the conclusion that you need to upgrade Backup Exec to a version compatible with OS X 10.9, keep reading.

I ran across this today with a new OS X 10.9 Mac Mini server. It turns out that in Mavericks the Switch.pm module has been deprecated and dropped from the version of Perl that comes with the OS, and the RAMS agent installer relies on that module to run. It also turns out that adding it back to the OS is a pretty simple process, and once done the RAMS installer runs just fine. Here's what I did:

1. Open a Terminal window on the OS X server
2. Run the command cpan install
3. Follow through the on-screen setup for CPAN (Comprehensive Perl Archive Network). I used the defaults
3a. During the install you will be prompted by the OS that you need to install make in order for the installer to continue. Click OK and let make install
4. Now CPAN should be installed so start it with sudo cpan. You need sudo access to install the Switch.pm file
5. You should see the cpan[1]> prompt. Type install Switch and press Enter. This will install the missing module back to your Perl library
6. Once that's done type quit and press Enter to get back to the main Terminal

Now you should be able to run the RAMS installer package normally, and this time it should work.
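
If you want to confirm the module is really there before launching the installer, a quick check can be sketched like this. It is a minimal sketch that assumes perl is on the PATH; the function name perl_module_installed is my own, and it simply asks Perl to load the module and looks at the exit status.

```python
import subprocess


def perl_module_installed(module, run=subprocess.run):
    """Return True if `perl -M<module> -e 1` exits with status 0.

    `run` can be swapped out for testing; by default it is subprocess.run.
    """
    result = run(["perl", f"-M{module}", "-e", "1"], capture_output=True)
    return result.returncode == 0
```

Calling perl_module_installed("Switch") should come back True once the cpan install has finished. If perl cannot be found at all, subprocess.run raises FileNotFoundError instead of returning.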

Before doing this though, you should really ask yourself if you should. It seems to be working fine for me, but the Switch module was deprecated and left out for some reason. I make NO guarantees about this process, other than that it worked for me. If you have a test network available, please make sure to use that first before making this change in your production environment. Like I said, this was figured out the same day that I posted it, so I haven't done enough testing to say that it won't adversely affect the system, RAMS performance, or both.

This is the original article I came across while trying to get the RAMS installer to work that led me to write this article.

Monday, June 9, 2014

Free up inactive memory in OS X

Let me preface this by stating that normally you should not have to do this, and it is typically best to let the OS manage the RAM usage. I suggest you use this method only when necessary rather than as a standard practice. I'm also assuming that you're familiar with the Terminal app, particularly if you want to schedule this. If you are not, you may need to do some research on Terminal outside of this post.

If you're like me, you've had an issue on an OS X server and when you check it out there is little to no free RAM available but you have a bunch stuck in the state known as inactive memory. This typically indicates one of two things. Either you need more RAM to run your server applications effectively, or something is causing a memory leak. In my case it is a little of both. However, running out and picking up Apple server memory on a whim isn't always an option, and tracking down a faulting program isn't always easy or quick either. If you're struggling to keep an OS X server accessible while waiting for an upgrade window or to give yourself time to troubleshoot, or if you're having problems with a personal Mac's memory allocation, this may be a temporary workaround you can use.

There is a command named "purge" that will free up the inactive memory. You can read more about it on the purge man page. You can simply open Terminal and issue the command, then press Enter. You may need to use sudo purge, but nonetheless you can invoke purge and free up some RAM immediately.

However, if this is a server or continues happening, you may want to script this command and call it on a schedule. While waiting for a shipment of RAM for a couple OS X servers having this issue, I used this script provided by Daniel Payne on stackoverflow.com in this article.

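#!/bin/bash
# Read the "Pages free" count from vm_stat (third field, e.g. "17250.")
free=$(vm_stat | grep "Pages free" | awk '{print $3}')
# Strip the trailing period so the value can be compared as an integer
freer=${free%%.*}
# Purge inactive memory only when fewer than 18000 pages are free
if [ "$freer" -lt 18000 ]
then
    nice purge
fi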

I opened Terminal and used vi to create the script file, but you should be able to use TextEdit or any text editor you'd like. Just make sure to name it with the .sh extension, so your file should be named something like freeRAM.sh. What this script does is run the purge command if there are fewer than 18000 free memory pages available. You will want to modify that value to match the minimum amount of free memory you will accept before running the command. This avoids running the purge command unnecessarily. Remember that this value is in memory pages, not B/KB/MB/GB. If you don't know the page size your system is using, you can run vm_stat within Terminal and it will tell you in the first line. For my server I was using the default page size of 4096 bytes. This means that if I wanted to run the purge command when there was less than 200MB of free memory, I would need to substitute the 18000 in the script above with:

200 MB * 1024 KB/MB * 1024 B/KB * 1 page/4096 B = 51200 pages (instead of 18000)
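
The same arithmetic can be sketched in Python if you want to work out the threshold for a different amount of memory or page size. Treat this as an illustration only: the function names are my own, and the sample text is made-up output in the shape that vm_stat prints, not a capture from my server.

```python
import re


def pages_for_megabytes(megabytes, page_size=4096):
    """Convert a free-memory floor in MB to a count of memory pages."""
    return megabytes * 1024 * 1024 // page_size


def parse_vm_stat(text):
    """Pull the page size and the free-page count out of vm_stat output."""
    page_size = int(re.search(r"page size of (\d+) bytes", text).group(1))
    free_pages = int(re.search(r"Pages free:\s+(\d+)", text).group(1))
    return page_size, free_pages


# Made-up sample in the layout vm_stat uses (assumption, not real data)
sample = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                               17250.
Pages active:                            412345.
"""

page_size, free_pages = parse_vm_stat(sample)
threshold = pages_for_megabytes(200, page_size)
print(threshold, free_pages < threshold)
```

For a 200MB floor and 4096-byte pages the threshold works out to 51200 pages, matching the calculation above.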

Once you have your shell script saved you need to add it to the schedule. You can use cron, but it appears to be deprecated in newer versions of OS X, so instead we'll use launchd, which is managed with launchctl. I used the guide found here to create my .plist file and get it into the launch daemons schedule.
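
For a rough idea of what such a .plist contains, here is a sketch that builds one with Python's plistlib. This is not the file from the guide I followed. The label com.example.freeram, the script path, and the 900-second interval are all placeholders I picked for illustration; Label, ProgramArguments, and StartInterval are standard launchd job keys.

```python
import plistlib


def build_launchd_plist(label, script_path, interval_seconds):
    """Return XML plist bytes for a job that runs a script every N seconds."""
    job = {
        "Label": label,                                 # unique job name
        "ProgramArguments": ["/bin/bash", script_path],  # command to run
        "StartInterval": interval_seconds,               # seconds between runs
    }
    return plistlib.dumps(job)


# Placeholder values, adjust to your own label, path, and schedule
plist_bytes = build_launchd_plist(
    "com.example.freeram", "/usr/local/bin/freeRAM.sh", 900
)
print(plist_bytes.decode("utf-8"))
```

The output would typically be saved as a file under /Library/LaunchDaemons and loaded with sudo launchctl load followed by the file path, but check that against your OS X version before relying on it.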

You can also create your script using Automator and then schedule it using iCal. However, I wanted to run the script multiple times a day, and it appears that using iCal allows once a day as the most frequent option.

Monday, May 19, 2014

OS X Active Directory Users losing admin privileges when offline

For anyone using Directory Services in OS X to bind the Mac to a Windows domain, you've likely seen the option to allow administration by..., where you can define groups to administer the machine. I have a security group setup in Active Directory specifically for this, and whenever I bind the Mac to the domain I add that group and turn that option on. However, once in a while, when a machine is not able to directly authenticate with an Active Directory server, domain users do not have local admin rights. Typically admin rights come back the next time the machine is able to communicate with Active Directory, but in the meantime it is an annoyance while offline. Fortunately, it appears that I'm not the only one who has been dealing with this. I only wish I had spent some time researching it sooner.

Previously, my workaround to this problem has been to remove the Mac from the Active Directory domain, and then rejoin. While this has worked, it is just a workaround rather than a solution. It appears that someone with the same issue has found the actual problem, and also posted the solution. What is apparently happening is that even though those groups are supposed to be allowed to administer the computer according to the setting in Directory Services, the accounts are not added to the local admin group on the Mac. You can fix this by opening a Terminal session, and running the following command:

dseditgroup -n /Local/Default -o edit -u localUsername -p password -a accountToAdd -t user admin

*UPDATE*

Rather than use the above command, I found simply using sudo removes the need for the -u and -p switches so you can use the following.

sudo dseditgroup -n /Local/Default -o edit -a accountToAdd -t user admin

-n = node
-u = local username used to authenticate to make the change
-p = password for user defined with -u
-a = name of account to add to the admin group
-t = type of account you're adding
admin = group name

You'll want to use your own information for -u, -p, and -a (-u and -p only apply to the first form of the command, without sudo). -t can take group as an option (instead of user). I haven't tried that yet, but it should allow you to add an entire security group to the local admin group in case you have multiple users for that one machine.
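
If you have several accounts to add, the sudo form of the command can be assembled programmatically. The sketch below only builds the argument list and does not run anything. The function name and the account name jsmith are made up, and passing group for -t is untested here, just as noted above.

```python
def dseditgroup_add_command(account, account_type="user", group="admin"):
    """Build the argument list for adding an account to a local group."""
    if account_type not in ("user", "group"):
        raise ValueError("account_type must be 'user' or 'group'")
    return [
        "sudo", "dseditgroup",
        "-n", "/Local/Default",   # node
        "-o", "edit",             # operation
        "-a", account,            # account to add
        "-t", account_type,       # type of account being added
        group,                    # target group name
    ]


# Hypothetical account name, for illustration only
print(" ".join(dseditgroup_add_command("jsmith")))
```

The printed line has the same shape as the sudo command shown earlier, with jsmith in place of accountToAdd.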

Now, I believe this may do the same thing as well if you're not comfortable using Terminal to issue that command. You'll need to have login info for an actual local admin account, and the domain account you want to grant admin rights to must have logged in to the machine at least once already. Simulate being offline by turning off the wifi connection and disconnecting any LAN cable(s). Once you're offline, go into System Preferences->Accounts, click the user that should have local admin rights and check the box that says "Allow user to administer this computer". Then reconnect your network connection and reboot.

The two articles I found related to this that I used are:

https://discussions.apple.com/message/16026679#16026679

https://discussions.apple.com/message/22540531#22540531




Wednesday, May 14, 2014

ProcessExplorer "Unable to extract 64-bit image" error

The ProcessExplorer program is a very useful utility. I needed it today to track down a file lock, but upon trying to run it on my Windows 7 64-bit machine, I kept getting an error telling me "Unable to extract 64-bit image...". A few Google searches mentioned this being caused by a permissions error, but this didn't make sense since I'm an admin on the machine. After I ran across this on the SysInternals forum, I realized that those saying permissions were the problem weren't wrong, but their answer wasn't specific enough.

Upon running the ProcessExplorer executable, it will extract the 64-bit version of the program to the AppData/Local/Temp folder and attempt to run from there. However, if you're like me and have restrictions on applications running from the Temp folder, this will cause the error. To get around it I simply navigated to my temp directory, moved the procexp64.exe file to my Desktop, and executed it from there. It opened right up and I was able to get back to what I needed ProcessExplorer for.

By default the AppData directory is hidden. The quickest way to get there is by clicking Start, then type %tmp% into the Search box and press Enter. Or type the path directly into the navigation bar, or choose to show hidden files.

Full path to the temp folder is C:\Users\"your username"\AppData\Local\Temp
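
The same workaround can be sketched as a small Python helper. Note that it copies the file rather than moving it, which is a swap on my part, and the function name copy_out_of_temp is made up. By default it looks in the current user's temp folder and writes to a Desktop folder under the home directory, both of which are assumptions about your setup.

```python
import shutil
import tempfile
from pathlib import Path


def copy_out_of_temp(filename, temp_dir=None, dest_dir=None):
    """Copy a file from the temp folder to another folder and return the new path."""
    temp_dir = Path(temp_dir) if temp_dir else Path(tempfile.gettempdir())
    dest_dir = Path(dest_dir) if dest_dir else Path.home() / "Desktop"
    source = temp_dir / filename
    if not source.exists():
        raise FileNotFoundError(f"{source} was not found")
    # copy2 keeps the file timestamps along with the contents
    return Path(shutil.copy2(source, dest_dir / filename))
```

Calling copy_out_of_temp("procexp64.exe") after the failed launch should leave a copy on the Desktop that you can run from there.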

Thursday, March 20, 2014

Toshiba Mobile LCD and IE 10 or 11 crash

I have one user with a Toshiba Portege ultrabook, and it's been fine. He also wanted a Toshiba mobile LCD screen to bring along for more screen real estate. Eventually his machine started having issues with Internet Explorer, where IE would crash immediately upon opening. I assumed it was a corrupt IE install since IE 10 had just come out, but uninstalling/reinstalling IE 10 didn't help. IE 9 worked fine though. After going through plenty of additional troubleshooting I finally found the cause of the issue. It was the DisplayLink driver that was installed with the mobile LCD screen. When that driver was installed IE would stop working, and as soon as it was uninstalled IE would go back to normal. Luckily DisplayLink had newer drivers available to download, and the newer driver worked fine with both IE 10 and IE 11.

For anyone scratching your head trying to find a solution to IE 10 or 11 crashing, if you have a portable LCD screen in use that uses DisplayLink drivers, try updating those first. It'll save you plenty of head scratching and unnecessary malware scans.

Friday, March 14, 2014

Configure VLAN(s) and enable routing on an HP Procurve switch

If you're running a managed HP Procurve switch and want to take advantage of VLANs to subnet your network, it's pretty easy. Here's a diagram of my example



In this example we have two VLANs (VLAN 1 and VLAN 2). VLAN 2 is setup just for workstations and must connect to VLAN 1 for DHCP, DNS, and Internet access.

In case the image is too small, on the switch, ports 1-24 are being designated as part of VLAN 1, and 25-48 are part of VLAN 2. VLAN 1 is the 192.168.10.0/24 subnet, and VLAN 2 is the 192.168.20.0/24 subnet.

On my primary and secondary DNS/DHCP servers, I have a DHCP scope setup for the primary network (VLAN 1), and another scope setup for VLAN 2. In my DHCP options for both, I set the primary DNS server to 192.168.10.10, and the secondary to 192.168.10.11. For VLAN 1, I set the router to 192.168.10.1, but on VLAN 2 I set the router to 192.168.20.254 since the default gateway needs to be found within the same subnet.
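
The rule that the default gateway has to sit inside the client's own subnet is easy to sanity check with Python's ipaddress module before you commit the DHCP options. This is just a sketch using the addresses from my example; the function name gateway_in_subnet is made up.

```python
import ipaddress


def gateway_in_subnet(gateway, subnet):
    """Return True if the gateway address falls inside the given subnet."""
    return ipaddress.ip_address(gateway) in ipaddress.ip_network(subnet)


# Router option and subnet for each DHCP scope in the example
scopes = {
    "VLAN 1": ("192.168.10.1", "192.168.10.0/24"),
    "VLAN 2": ("192.168.20.254", "192.168.20.0/24"),
}

for name, (gateway, subnet) in scopes.items():
    print(name, gateway, gateway_in_subnet(gateway, subnet))
```

Both scopes pass. Pointing VLAN 2 clients at 192.168.10.1 instead would fail the check, which is why the VLAN 2 scope uses 192.168.20.254.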

To actually set this up, first, you would telnet into your Procurve switch, which I'm hoping you know how to do if you're going to attempt setting up a VLAN. You'll need enable access on the switch as well. Once you've logged into the switch and are at the terminal, here is what I would enter to set up the above example. I've added comments/explanations on all lines, so be aware that you do not want to enter the - (.....) from the lines into the terminal window

enable - (enables admin access)
conf t - (enters configuration mode using the terminal)
ip routing - (enables IP-based routing, which is required to allow the two VLANs to communicate)
vlan 1 - (will enter the configuration mode for vlan 1, which should exist by default on the switch)
untag 1-24 - (untags ports 1-24 on the switch to indicate they're going to be restricted to vlan 1)
ip address 192.168.10.254/24 - (assigns the IP address of 192.168.10.254 to the VLAN 1 interface)
vlan 2 - (will create vlan 2 if it doesn't already exist, then enters configuration mode for it)
untag 25-48 - (untags ports 25-48 on the switch to indicate they're going to be restricted to vlan 2)
ip address 192.168.20.254/24 - (assigns the IP address of 192.168.20.254 to the VLAN 2 interface)
ip helper-address 192.168.10.10 - (sets VLAN 2 to send DHCP packets to the primary DHCP server)
ip helper-address 192.168.10.11 - (sets VLAN 2 to send DHCP packets to the secondary DHCP server)
ip route 0.0.0.0 0.0.0.0 192.168.10.1 - (sets the default route to the default gateway in VLAN 1)
write mem - (commits the changes you made to the configuration stored in memory on the switch)
end - (exits configuration mode)
exit - (exits enable mode)
exit - (logs you off from your telnet session)

The one issue I ran into when I first did it is I had "ip default-gateway 192.168.10.1" set on my switch and thought that was good enough for my VLAN 2 to get to the Internet. However, that is only effective when ip routing is disabled, and for the VLANs to communicate ip routing needs to be turned on. That requires you to add an actual static route, or use ip default-network if it's an available option. For more information on that see this link. It's from Cisco, but the same applies to the Procurve devices. That link explains the differences between the default gateway options, and what routing protocols are affected by each.

The one thing I didn't touch on here is setting up your actual routing to be able to reach VLAN 2. For that you'll have to decide what is best, because it depends on your network and the routing devices and protocols in use. In my example, I needed to set up a route in VLAN 1 that would send traffic for 192.168.20.0/24 to 192.168.10.254 (the switch's IP on VLAN 1). If you have OSPF configured on your network and your switch participates, then you likely have nothing to do here. For my network, the switch doesn't support OSPF, and the router is managed by my ISP so I have no access to it. In order to get traffic to VLAN 2, I added a static route to my firewall for it. That way it still gets advertised over OSPF and VLAN 2 can be reached.

When I initially decided to do this, I used a few articles to come up with the final configuration. In case they may be helpful to you:


Happy VLANing!






Thursday, October 31, 2013

Convert user mailbox to resource when using Exchange Online

I am in the middle of a cutover migration from an Exchange 2003 environment to Exchange Online. One issue with the cutover is that all our resource mailboxes were imported as user mailboxes, rather than resources. Luckily, using PowerShell, it is easy to convert these and I only had ~10 so it was easy enough to do. If you have a lot of resource mailboxes to convert you may want to script it instead. Anyway, here are the instructions which I originally found from here:

1. Connect Windows PowerShell to the Service. Refer to the article below.
http://help.outlook.com/en-us/140/cc546278.aspx

2. Run set-mailbox mailboxName -type room, substituting your specific mailbox name in for mailboxName, and also specifying either room OR equipment. The example is to convert to a room resource
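
If you do have a lot of mailboxes to convert, one low-tech way to script it is to generate the command lines and paste them into the connected PowerShell session. The sketch below does only that and never talks to Exchange. The function name and the mailbox names are made up; it writes the mailbox name with -Identity, which is the named form of the positional argument used above.

```python
def set_mailbox_commands(mailboxes):
    """Build one Set-Mailbox line per mailbox; values are 'Room' or 'Equipment'."""
    commands = []
    for name, kind in mailboxes.items():
        if kind not in ("Room", "Equipment"):
            raise ValueError(f"unsupported resource type: {kind}")
        commands.append(f'Set-Mailbox -Identity "{name}" -Type {kind}')
    return commands


# Hypothetical mailbox names, for illustration only
resources = {"ConfRoomA": "Room", "Projector1": "Equipment"}

for line in set_mailbox_commands(resources):
    print(line)
```

Review the generated lines before running them, since a typo in a mailbox name would convert the wrong mailbox.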

Friday, October 25, 2013

Cannot get mail connection to server is unavailable error on iOS device

The typical answers to this would be verify your username/password/server address, and make sure you have an active data connection and can access the server. However, I'm posting this because I ran across a very odd solution to this error message when those first two failed.

I had a user who could not receive any new emails on her iPhone 5 through an Exchange ActiveSync connection, yet she could send and also had the contacts and calendar working fine. It happened out of nowhere and I was able to replicate the issue on a 2nd iPhone. No other users were reporting issues, and the user's email was working fine in Outlook and OWA. It was very odd.

After doing some extensive troubleshooting I finally tracked down the issue and decided to post it in case anyone else runs across the same problem and is scratching their head. The user had used an emoticon when sending an email, and had two replies to that message in her Inbox. However, that emoticon had gotten corrupted on the receiving end, so when sent back to her in the reply it actually made the messages unreadable on the iPhone and prevented any newer emails from being downloaded as well. As soon as I removed those 2 emails from the Inbox, the iPhone began working correctly again.

Thursday, October 3, 2013

Error while trying to export to PST using Export-Mailbox on Exchange 2007

This is probably valid for more than just Exchange 2007, but that's the version I was working on where I ran into the problem. After finally getting myself setup with a machine running a 32-bit Windows OS, and with both Exchange Management Tools and Outlook 2007 installed, I tried running my Export-Mailbox command. No go, and in checking the migration log file the error looked something like:

Error was found for user1 (user1@email.com) because: Error occurred in the step: Moving messages. Failed to copy messages to the destination mailbox store with error: 
MAPI or an unspecified service provider.
ID no: 00000000-0000-00000000, error code: -1056749164

I scratched my head on this one for a while after the Google results for the error code didn't show any results. Luckily my brain still works without Google (kind of...) and I had to smack myself a little for how simple the fix was. I didn't have full access to the mailboxes I was trying to export with the account I was logged in with! Once I realized that I simply ran

Add-MailboxPermission -Identity user1 -User myaccount -AccessRights FullAccess

Once that was done I tried my Export-Mailbox command again and voilà! Exporting to PST files from the Exchange 2007 information store worked as expected.