Recovering Apple’s Wiki After Storage Failure or How I Learned to Love pg_resetxlog

Posted: February 2nd, 2016 | Author: | Filed under: Mac OS X, Mac OS X Server, Mountain Lion, postgres, Wiki | No Comments »

facepalmRecently I received a panicked phone call from a fellow sysadmin who was in a real jam. He had a customer who was dumping all their knowledge into Apple’s Wiki system running on top of Mountain Lion and Server 2.2.5. The storage system in the mini failed and they had to recover from backup, however the backup was setup using Carbon Copy Cloner and as we all know you cannot rely on a file-based backup system to backup a running postgres database.

After the data was restored the machine did boot but all the postgres services would not start, including the wiki. After reviewing the logs for quite some time I found some entries of pgstat wait timeout and then no log entries for about a day. I assumed that this was our hard drive failure window. Then two days later the log started producing tons of postgres crash statements, launchctl statements and this little nugget Jan 19th 13:29 database system was interrupted This was all the information I needed. From what I can tell, between the time that Carbon Copy Cloner calculated changes and the time that it copied the data some minute things changed within the database and so CCC didn’t get a proper clone. It appears that this error is caused when the database engine no longer knows where to start writing data back into the database. Basically, the counters were broken and had to be reset. Luckily postgres makes a tool called pg_resetxlog

The command has this basic structure:

pg_resetxlog
-x XID set next transaction ID
-m XID set next multitransaction ID
-o OID set next OID
-l TLI,FILE,SEG force minimum WAL starting location for new transaction log
/path/to/database/directory

Now the Apple Wiki postgres data is held within /Library/Server/PostgreSQL\ For\ Server\ Services/Data which is an important detail to hold onto. Within this directory are all the bits of info you’ll need to run the following calculations. You’ll also need this decimal to hex converter.

Source: http://www.postgresql.org/docs/9.0/static/app-pgresetxlog.html

A safe value for the next transaction ID (-x) can be determined by looking for the numerically largest file name in the directory pg_clog under the aforementioned postgres data directory, adding one, and then multiplying by 1048576. Note that the file names are in hexadecimal. It is usually easiest to specify the switch value in hexadecimal too. For example, if 0011 is the largest entry in pg_clog, -x 0x1200000 will work (five trailing zeroes provide the proper multiplier).

A safe value for the next multitransaction ID (-m) can be determined by looking for the numerically largest file name in the directory pg_multixact/offsets under the data directory, adding one, and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the switch value in hexadecimal and add four zeroes.

A safe value for the next multitransaction offset (-O) can be determined by looking for the numerically largest file name in the directory pg_multixact/members under the data directory, adding one, and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the switch value in hexadecimal and add four zeroes.

The WAL starting address (-l) should be larger than any WAL segment file name currently existing in the directory pg_xlog under the data directory. These names are also in hexadecimal and have three parts. The first part is the “timeline ID” and should usually be kept the same. Do not choose a value larger than 255 (0xFF) for the third part; instead increment the second part and reset the third part to 0. For example, if 00000001000000320000004A is the largest entry in pg_xlog, -l 0x1,0x32,0x4B will work; but if the largest entry is 000000010000003A000000FF, choose -l 0x1,0x3B,0x0 or more.

Once you have these four values you’re ready to try it out on your database. But before I began I requested a full bootable clone of the server as it was when they restored it, then I took this cloned and placed it into a VM in Fusion and snapped the VM before trying anything. Also, don’t forget that when you want to issue commands to the Apple postgres service you have to use the full path to the commands as well as use the _postgres user. My final command, which recovered the wiki system AND profile manager, looked like this:

sudo -u _postgres /Applications/Server.app/Contents/ServerRoot/usr/bin/pg_resetxlog -f -x 0x100000 -m 0x10000 -o 0x10000 -l 0x1,0x2,0x18 /Library/Server/PostgreSQL\ For\ Server\ Services/Data

Feel free to reach out if you are having issues.


Open Directory Replication 10.8.5 problems with Kerio Connnect 8.3.0

Posted: June 22nd, 2014 | Author: | Filed under: Kerberos, Kerio, LDAP, Mac OS X, Mac OS X Server, Mountain Lion, Open Directory | Tags: , , , , | No Comments »

kms_bubbleI recently was hired to implement an Open Directory Master/Replica into a network that wanted to leverage Kerio Connect mail server. At first, all seemed fine. I created the directory, the replica, and installed the kerio extension on both servers as was instructed by the fine folks at Kerio.┬áNow I’d just like to say that this is different than what I remember in the days of 10.6. Back then you only had to install the OD extension on the master, the replica would then copy the schema over so that it could import the extended schema data at that time.

The problem comes into play when you have a master with already provisioned users in Kerio and you want to add an OD replica. Since the replica does not copy over the extended LDAP schema it is unable to replicate any provisioned users. The result is that those users will not exist in the replica which is bad news if you have services relying on that replica. To resolve this problem use the following procedure on the replica you wish to build:

sudo slapconfig -createreplica <master IP> diradmin

Once complete install the Kerio extention.

slapconfig -stopldapserver
slapadd -v -F /etc/openldap/slapd.d -c -l /var/db/openldap/openldap-data/backup.ldif
slapconfig -startldapserver

#gowellandinpiece
#replication


Automated Backups of Mac OS X Server 2.2.2

Posted: April 27th, 2014 | Author: | Filed under: DNS, Mac OS X, Mac OS X Server, Mountain Lion, Open Directory | No Comments »

Hi Everybody! dr-nick-riviera

So I’ve been in the Mac game for quite some time now and all along I was always longing for a good automated backup solution. A few years ago myself and a colleague got together and wrote osx-backup.sh. A simple shell script with a few variables inside. Simply edit the shell script and then install as a cronjob to run nightly. Features of this backup script include:

  • Open Directory archiving
  • Service Plists
  • CalDAV/CardDAV database
  • Profile Manager database
  • DNS records
  • Wiki database and binary files
  • Webmail

I’ve been using this script for years now under 10.6, 10.7 and 10.8. The version listed here is for Server 2.2.2 under 10.8.5

Restoration of these backups is fairly simple to do as long as you know some postgres commands. Here’s the article on how to restore the wiki.

Calendar, webmail are fairly similar. DNS restoration is just a matter of placing the files back in /var/named and /etc/named.conf

If you need to restore open directory archive you should use Apple’s latest knowledge base instructions. Just make sure that the server hostname matches the backup.

To restore OS X Server setting plists:

sudo serveradmin settings < /path/to/your-sa_backup-servicename-plist

Get the code here.


Zentyal 3.0, Mountain Lion, Kerberos and SSO

Posted: March 2nd, 2013 | Author: | Filed under: Blog, Kerberos, Mac OS X, Mountain Lion, Open Directory, Zentyal | Tags: , , , , , , | No Comments »
Now with Zentyal you can kerberize your shoes.

Now with Zentyal, you can kerberize your shoes.

This article is a continuation of a really great read by shabangs.net His article is great to bind your Macintosh to a Zentyal directory server however, after completing the how-to I was unable to change a network user’s password, store a local copy of the network user’s password for “mobility” nor leverage some great single sign on services from zentyal.

What we will attempt is to configure /etc/krb5.conf for Mac OS X 10.8, Mountain Lion, so that we will receive a TGT from zentyal when the user either logs in or wakes the computer from sleep.

First you need to get the kerberos realm. To do this sign into Zentyal and go to Users and Groups. In here you’re looking for the LDAP search base, this base will also be your Kerberos realm.

Now we want to search and replace EXAMPLE.COM with that realm, and replace your.server.example.com with the FQDN of your Zentyal server. Only set the dns_lookup_* values to true if you’re using the Zentyal server for DNS.

All edits are client side ONLY
If /etc/krb5.conf does not exist then just create it.

[libdefaults]
default_realm = EXAMPLE.COM
dns_lookup_kdc = true
dns_lookup_realm = true
default_tgs_enctypes = arcfour-hmac-md5 des-cbc-md5 dec-cbc-crc
default_tkt_enctypes = arcfour-hmac-md5 des-cbc-md5 dec-cbc-crc
preferred_enctypes = arcfour-hmac-md5 des-cbc-md5 dec-cbc-crc

[realms]
EXAMPLE.COM = {
admin_server = your.server.example.com
kdc = your.server.example.com
kpasswd = your.server.example.com
}

[kadmin]
default_keys = des-cbc-crc:pw-salt des-cbc-md5:pw-salt arcfour-hmac-md5:pw-salt aes256-cts-hmac-sha1-96:pw-salt aes128-cts-hmac-sha1-96:pw-salt

In order to obtain a Ticket Granting Ticket (TGT) when logging in via the login window, edit /etc/pam.d/authorization and append default_principal option to the pam_krb5.so line.


auth optional pam_krb5.so use_first_pass use_kcminit default_principal

In order to obtain a Ticket Granting Ticket (TGT) when authenticating to the Screen Saver, edit /etc/pam.d/screensaver and append default_principal option to the pam_krb5.so line.


auth optional pam_krb5.so use_first_pass use_kcminit default_principal

Now sign out and back in as a network user, open a terminal and type klist You should get something like:


lisa:~ test$ klist
Credentials cache: API:51104:6
Principal: test@EXAMPLE.COM

Issued Expires Principal
Mar 2 09:28:04 Mar 2 19:28:04 krbtgt/EXAMPLE.COM@EXAMPLE.COM

If so, great! This means kerberos is running, now try to change the user’s Open Directory password. It should succeed as well. If not make sure you have the console open to see what’s going on. 99% of the time it’s a DNS issue or the clocks on your workstation is out of sync with Zentyal.

Now try to mount an SMB volume from the Zentyal server, it *should* mount without credentials and a new ticket will appear in the output of klist


Issued Expires Principal
Mar 2 09:34:52 Mar 2 19:34:48 krbtgt/EXAMPLE.COM@EXAMPLE.COM
Mar 2 09:34:56 Mar 2 19:34:48 cifs/your.server.example.com@EXAMPLE.COM