Generating Audit Triggers in SQL

This is a placeholder post, but I wanted to get the code out there for someone.
Here's the gist of the DDL trigger: it really just intercepts creates and alters to tables and calls a stored proc that generates the audit trigger for that table. The stored proc does the work. We support skipping the audit on certain tables by setting an extended property on that table, and we're considering the same idea on columns to skip auditing specific columns. In our case we use the RowGuid on each row as the identifier in the audit, and we care about an application-specific ChangeUserID, not the user connected to SQL Server. And if you want to run it for all tables, there's a sproc for that too.
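Since this is only a placeholder, here's a minimal sketch of the approach rather than our production code -- the object names (dbo.GenerateAuditTrigger, dbo.GenerateAllAuditTriggers) and the extended-property name (SkipAudit) are stand-ins for whatever you'd actually use:

    -- DDL trigger: on any CREATE/ALTER TABLE, hand the table off to the
    -- proc that (re)generates its audit trigger.
    CREATE TRIGGER ddltr_GenerateAuditTriggers
    ON DATABASE
    FOR CREATE_TABLE, ALTER_TABLE
    AS
    BEGIN
        DECLARE @EventData xml, @SchemaName sysname, @TableName sysname;
        SET @EventData = EVENTDATA();
        SELECT @SchemaName = @EventData.value('(/EVENT_INSTANCE/SchemaName)[1]', 'nvarchar(128)'),
               @TableName  = @EventData.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(128)');

        -- Skip tables flagged with the (assumed) SkipAudit extended property.
        IF EXISTS (SELECT 1 FROM sys.extended_properties
                   WHERE class = 1
                     AND major_id = OBJECT_ID(QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName))
                     AND minor_id = 0
                     AND name = 'SkipAudit')
            RETURN;

        -- The stored proc does the real work of generating the audit trigger.
        EXEC dbo.GenerateAuditTrigger @SchemaName, @TableName;
    END;
    GO

    -- And to (re)generate audit triggers for every existing table:
    CREATE PROCEDURE dbo.GenerateAllAuditTriggers
    AS
    BEGIN
        DECLARE @SchemaName sysname, @TableName sysname;

        DECLARE tables_cur CURSOR LOCAL FAST_FORWARD FOR
            SELECT s.name, t.name
            FROM sys.tables t
            JOIN sys.schemas s ON s.schema_id = t.schema_id;

        OPEN tables_cur;
        FETCH NEXT FROM tables_cur INTO @SchemaName, @TableName;
        WHILE @@FETCH_STATUS = 0
        BEGIN
            EXEC dbo.GenerateAuditTrigger @SchemaName, @TableName;
            FETCH NEXT FROM tables_cur INTO @SchemaName, @TableName;
        END;
        CLOSE tables_cur;
        DEALLOCATE tables_cur;
    END;
    GO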

How I monitor for errors in SQL Server's ERRORLOG

Like any other complex system, things can go wrong in the night (or any other time of day) in SQL Server, and like any other complex system, there's a log of these things that happen. Windows has the Event Viewer, IIS has the W3C logs, and SQL Server has the ERRORLOG text file.

Typically located in \MSSQL\Log, the log file is a record of all "important" activity that has occurred since the server was last started. The term "important" is relative, and generally means "things Microsoft felt should be recorded somewhere." This includes backups, restores, IO errors, CLR loads, etc. In addition to the default things Microsoft puts there, you can tweak SQL Server to log additional rows to the log file using various DBCC trace flags. For example, turning on trace flags 3605 and 1222 tells SQL Server to log detailed information about deadlocks to the error log file -- in my opinion, a good idea to have turned on at all times.
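Enabling them for the running instance is a one-liner (to make them survive a restart you'd add -T3605 and -T1222 as startup parameters instead):

    -- Turn both flags on globally (-1 = for all connections) until the next restart.
    DBCC TRACEON (3605, 1222, -1);

    -- Confirm which global trace flags are currently enabled.
    DBCC TRACESTATUS (-1);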

Now, a wise DBA will review this error log file for abnormal activity on a daily basis, but there are two problems. First, the errorlog file can get HUGE! After all, it's a log of all activity (well, important activity) since the server started. And second, there's a ton of normal activity that we know about and don't care about, but it gets logged anyway. Finding the stuff we care about as DBAs among all the other "important" stuff is difficult at best. I've never found the secret DBCC flag called "Only Log the Stuff I Care About". So what do we do?

First, there's a system stored proc called sp_cycle_errorlog -- it does exactly what it says: it tells SQL Server to close the current errorlog and start a new one, without having to restart SQL Server! The "current" error log is always called ERRORLOG, the one before that is ERRORLOG.1, the one before that is ERRORLOG.2, and so on. To be clear, there are lots of renames going on during a cycle. If you have 10 errorlogs then errorlog.10 is renamed to errorlog.11, errorlog.9 is renamed to errorlog.10, errorlog.8 is renamed to errorlog.9 ... errorlog.1 is renamed to errorlog.2, and finally errorlog (the active log) is renamed to errorlog.1 and a new log is started.
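Cycling the log on demand is as simple as:

    -- Close the current ERRORLOG and start a fresh one -- no restart required.
    EXEC sp_cycle_errorlog;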

I have a job in SQL Server that runs at 6AM (why 6AM? Because it seems like a good time!) that has two steps. The first step simply runs sp_cycle_errorlog, as shown above. The second step is my version of the missing "Only Log the Stuff I Care About" flag. Since there are all sorts of things that can go wrong in SQL Server, I didn't feel comfortable searching for specific things to show. But I do know what I don't want to see! Since the errorlog file is just a text file, I can use grep to rip through the file and pull out what I don't want. Yes, I said what I don't want! There's a seldom-used (well, I seldom use it) option in grep called -v (long form --invert-match). It reverses the functioning of grep: grep will typically output all matching lines, but with -v it will output all non-matching lines. This way I can run grep on the errorlog and strip out all the stuff I don't want.

There are two other parts in my error monitoring process. The first is to email myself the results, and to do that I use blat. The second is to account for the fact that the errorlog is in Unicode, and grep (at least the version I have) doesn't read Unicode. The Unicode problem is actually handled by the built-in "type" command: it turns out that "typing" a file converts it from Unicode to ASCII, which is what grep wants.

It's actually all on one line in the .cmd file I run from within the SQL Agent job, but breaking it up makes it simpler to see in a blog post. The first "line" does the conversion from Unicode; I know the file name is always ERRORLOG.1 because I run this right after the sp_cycle_errorlog procedure. The command that processes my error log, then, looks like this:
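(A sketch, shown broken at the pipes for readability and assumed to run from the instance's \MSSQL\Log directory -- the grep patterns, the recipient address, and the blat arguments are placeholders; substitute whatever your own errorlog messages and mail setup call for.)

    type ERRORLOG.1
      | tail -n +8
      | grep -v "Database backed up"
      | grep -v "Log was backed up"
      | grep -v "Loaded assembly 'sqlserverobjects"
      | grep -v "AppDomain .* created"
      | grep -v "AppDomain .* unloaded"
      | grep -v "marked for unload due to memory pressure"
      | blat - -to dba@example.com -subject "SQL Server errorlog report"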

The next line runs the tail command to give me everything excluding the first 7 lines (i.e., starting at line 8), since those first 7 lines are boilerplate I don't care about (copyright, version, etc.).

The next two lines pull out the log and database backups -- with hundreds of DBs, logfile backups every 15 minutes, plus nightly full backups, this step pulls out lots of unwanted crap.

The next four lines deal with the SQLCLR objects we're loading. We know they're getting unloaded due to memory pressure (something we're working on), so we just ignore the load and unload messages. Note, however, that I'm specifically matching the load of the assembly "sqlserverobjects"; any other assembly load would still show up.

The final line emails me the results. In one resulting email, an IO error showed up around 17:45 (line 5 of the email) -- I never would have seen that amongst 10,000 other lines of unimportant "important" stuff! This is a simple process to code up, and it then runs on its own every day. A great way to see the truly important stuff happening in SQL Server.

Computing MD5 Hash of varbinary(max) in SQL Server CLR

SQL Server provides the HASHBYTES function, which computes a hash (MD5, SHA1, etc.) of the bytes passed to it, with one problem... it only accepts the older (pre-2005) max length of 8000 bytes in a varbinary. That's useful, I suppose, if you're trying to get the MD5 hash of a password, or a name, but completely useless in a case like mine, where we needed to compute the MD5 hash of a blob column -- and a blob column where the max datalength is dozens if not hundreds of megabytes. I needed something different!
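Just to make the limitation concrete (the variable here is made up):

    -- Fine: HASHBYTES happily hashes a short value.
    DECLARE @short varbinary(8000);
    SET @short = CAST('hello world' AS varbinary(8000));
    SELECT HASHBYTES('MD5', @short) AS ShortHash;

    -- Not fine on these versions: HASHBYTES won't take more than 8000 bytes of
    -- input, so pointing it at a multi-megabyte varbinary(max) column is a dead end.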

Our first pass was to write a command-line application that selected the filedata column (the blob) from the table and then computed the MD5 from within C#. That made some sense, since the point of all this was to check an export we were doing of the file data from blobs to files in the file system, but it just seemed wrong somehow. If only HASHBYTES weren't broken! Then it dawned on us: we could write a SQL CLR function that computed the hash value from within SQL Server. This made infinitely more sense.

If you haven't played with them, SQL CLR functions allow you to code a static method in your .Net language of choice (C#!) and then install it into SQL Server as if it were a function you'd written in TSQL. You can use it like any other function in a TSQL statement. SQL Server actually acts as a CLR host, starting the appropriate app domains and managing memory consumption (we'll touch on that later). Your .Net code is actually running IN the SQL Server process! If you're wondering about Microsoft's commitment to SQL CLR, the new spatial data types (geography and geometry) in SQL Server 2008 are implemented using SQL CLR! Even though HASHBYTES is broken, we could write our own hash function that does work on varbinary(max) values and still compute the hash value from within SQL Server.

My immediate requirement was for an MD5 hash of a varbinary value, but I wanted to make the API slightly more generic, so I came up with a method signature that took the varbinary we wanted to hash and the type of hash to compute. This way the same method could handle MD5, SHA1, etc.

My first pass looked like the code below. And it worked great in testing. I passed in the blob to hash, and a string to tell me the type. For example: select dbo.ComputeHash(BlobColumn, 'md5') from TableName -- and it worked just fine (in test!). I installed it in production and let her rip!
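In sketch form (the class and parameter names here are mine, not necessarily the originals), that first pass looks something like this:

    using System.Data.SqlTypes;
    using System.Security.Cryptography;
    using Microsoft.SqlServer.Server;

    public partial class UserDefinedFunctions
    {
        // First pass: pull the whole value out as a byte[] (SqlBinary.Value)
        // and hand it to whichever HashAlgorithm was asked for ("MD5", "SHA1", ...).
        [SqlFunction]
        public static SqlBinary ComputeHash(SqlBinary source, SqlString hashType)
        {
            using (HashAlgorithm algorithm = HashAlgorithm.Create(hashType.Value.ToUpperInvariant()))
            {
                // Note: no null check here -- as described below, that turns out to matter.
                return new SqlBinary(algorithm.ComputeHash(source.Value));
            }
        }
    }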

As proof of why you should test corner cases, two things happened in production. First, what happens when the code above attempts to compute the MD5 hash of a null? That is, the passed-in SqlBinary value is null because the underlying column is null? Well, it goes boom! That's what happens. This isn't a big deal, as the fix is simple: if the source value is null, return a null. I wasn't sure what the "right" value to return was, but I figure the hash of a null is different from the hash of an empty string, so returning null made the most sense. Problem fixed... rerun...

OK, now there's a new problem. The code ran fine for the first 5000 or so blobs it handled, and then boom! We got around to a blob that was 32MB in size and that was just too darn big. SQL Server killed the CLR app domain because it was consuming too much memory! Remember, the CLR integration is running in-process in SQL Server, and SQL Server's primary concern is to protect itself. When an app domain gets too big SQL Server kills it. Specifically you get:

Msg 6532, Level 16, State 49, Line 1
.NET Framework execution was aborted by escalation policy because of out of memory.

It took some Googling, but I came across Chris Bernard's blog post "Out of memory exception while attempting to do sql clr", which shed some light on the issue -- thanks, Chris!

However, I didn't like the idea of messing with the SQL Server settings, especially since this was a one-time function we were running. So I needed a different solution. I knew the problem was that we were passing the blob in as one giant memory allocation. SQL Server probably buffered it up in its own memory, then passed a copy to the CLR, which probably also had a copy or two hanging around. That's all fine when you're passing 500 bytes, but with a 32MB blob it adds up fast! If only we could get raw access to the SQL data...

I learned long ago that if you find yourself saying "Gee, it would have made sense if Microsoft did 'x'," then they probably already do 'x' -- you just need to find out where and how. And this was no different. In addition to the SqlBinary class, there's also a SqlBytes class. And guess what: SqlBytes provides a property called Stream that gives access to the underlying stream of bytes. Since HashAlgorithm.ComputeHash takes a stream just as readily as it takes a byte[], we could simply pass the stream in, and everything worked.
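Put together with the null fix from earlier, the reworked sketch (again, the names are mine) looks like this:

    using System.Data.SqlTypes;
    using System.Security.Cryptography;
    using Microsoft.SqlServer.Server;

    public partial class UserDefinedFunctions
    {
        [SqlFunction]
        public static SqlBinary ComputeHash(SqlBytes source, SqlString hashType)
        {
            // The hash of a NULL is NULL -- the lesson from the first production run.
            if (source.IsNull)
                return SqlBinary.Null;

            using (HashAlgorithm algorithm = HashAlgorithm.Create(hashType.Value.ToUpperInvariant()))
            {
                // Hash the underlying stream instead of materializing the whole blob
                // as one giant byte[], which keeps the app domain's memory usage flat.
                return new SqlBinary(algorithm.ComputeHash(source.Stream));
            }
        }
    }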

I kept an eye on the memory usage of the CLR with this new code and it never exceeded 75MB, a much better solution!

Next time we'll look at how to install the CLR function in SQL Server, and we'll see why I needed the command-line applications to create a hex representation of a file and to populate the clipboard from the command line.