Jun 27 2010

My “NHibernate.Search and Lucene.NET” Presentation from CodeStock 2010

Category: Lucene | SQL | NHibernateMatt @ 15:36

CodeStock 2010 is over. I had a good time, talked with lots of cool people, and attended some great sessions.  Though I had to miss day 1 due to commitments for my new job at TrackAbout, I was able to attend and present on day 2.  I’d like to thank everyone that attended my presentation, and I hope everyone took something useful away from it.

I’ve uploaded my slides and code, so if you missed the session and are curious what was covered, or if you attended and want to see it again, check out the links below.  The slide deck is a little sparse on content since most of the session was me talking and writing code, but I’m working on a few blog posts that will fill the gaps (and possibly more).  Anyway, check out the links, and let me know if you have any questions or comments!

Adding powerful search capabilities to your application with Lucene.NET and NHibernate Search
Slides
Code

Tags:

Feb 10 2009

My day as a data analyst

Category: SQLMatt @ 09:17

That's right, I have fallen from grace as lead developer and have been reduced to writing mundane SQL to generate spreadsheets for customers.  Fortunately this is just a temporary assignment, but it's a PITA nonetheless. 

I did learn a few useful things today though.  First, SQL sucks.  And second, SQL sucks. Third, if you have a really complicated query that's going to require a lot of nested tables and selects, you can really save your sanity by refactoring things into UDFs.  Yeah, it's a little dirty having UDF helpers for what amounts to an ad-hoc query, but it really makes life simpler.  Consider the following (greatly simplified) example:

Task: Create a report of projects, one line per project.  Each line should contain a single column with all the personnel involved, another with all the organizations involved, and a link to the project.

What we have: Projects are stored in a table.  There is a one-to-many relationship to a table that contains organizations.  Projects are also one-to-many related to documents.  Each document is one-to-many related to a table containing people.  Each document also has a URL.  A project can have a lot of URLs because of this.

The (first) problem: We want to aggregate some of the one-to-many rows into a single field in the SQL output.  Yeah, that would be easy/easier with MySQL and group_concat, but this isn't MySQL, it's SQL Server.  There are ways around it, but I found that it was easier to just create UDFs for exactly what I needed, like so:

   1: --------------------------------------------
   2: -- Gets a string of all the distinct funding
   3: -- organizations for an activity.
   4: --------------------------------------------
   5: ALTER function [dbo].GetFundingOrgs
   6: (
   7:     @activityID uniqueidentifier
   8: )
   9: returns varchar(1000)
  10: as
  11: begin
  12:  
  13: declare @orgs as varchar(1000)
  14:  
  15: SET @orgs = ''
  16:  
  17: SELECT @orgs = @orgs + PrimaryName + ', '
  18: FROM Organizations
  19: INNER JOIN ActivityOrganizationMetadata 
  20:     ON Organizations.OrganizationID = ActivityOrganizationMetadata.Organization
  21: WHERE Activity = @activityID
  22: AND Propogated = 0
  23: GROUP BY PrimaryName
  24:  
  25: --Remove the trailing ', '
  26: IF LEN(@orgs) > 0 
  27:     SELECT @orgs = substring(@orgs, 1, LEN(@orgs)-1)
  28:  
  29: return ( @orgs )
  30:  
  31: end

I could have done something similar using nested CTEs and other hideous garbage, but this works well enough.

The second problem: We want a *single* URL for each activity.  That means we can't just do a join between the projects and documents, because then we get multiple rows for each activity.  Again, UDFs to the rescue:

   1: --------------------------------------------
   2: -- Gets the original URL for the first 
   3: -- document in the activity.
   4: --------------------------------------------
   5: ALTER function [dbo].GetOriginalUrlForActivity
   6: (
   7:     @activityID uniqueidentifier
   8: )
   9: returns varchar(100)
  10: as
  11: begin
  12:  
  13: return
  14: (
  15:     SELECT TOP 1 Url FROM DocumentsToQueries
  16:     INNER JOIN Documents ON DocumentID = Document
  17:     WHERE ActivityID = @activityID
  18:     AND Url LIKE 'http%'
  19: )
  20:  
  21: end;

And like magic, my report works!

Hopefully tomorrow will be back to non-SQL work. 

Tags:

Sep 29 2008

SQL Server Tip: sqlcmd variables

Category: SQLMatt @ 09:35

Here's a fun little tip you can use in your SQL scripts to make them scriptable via the sqlcmd utility while still supporting SSMS.  Say you want to make the name of your database configurable so you can script out the deployment of your database.  You could do that with a script like this:

   1: SET @dbName = '$(DBNAME)'
   2: ...
   3: EXEC sp_executesql 'CREATE DATABASE [' + @dbName + ']'
   4: ...


But what happens when someone decides to run your script in SSMS?  It will try to create a database named '$(DBNAME)'!  That's not even remotely what we want.

NOTE: Yeah, I know you can make SSMS execute the script in SQLCMD mode, but then it just dies with a 'variable not defined' error, which isn't what we want either.

What we'd like to do is have the script fall back to a default when run in SSMS.  That's easy to do, just add these two lines of code after your initial SET statement:

   1: if @dbName = '$' + '(DBNAME)'
   2:     SET @dbName = 'DefaultName'

All it's really doing is checking to see if @dbName is set to the string that would have been replaced by an actual value had the script been run via sqlcmd.  You do have to break it up into two parts though to prevent sqlcmd from performing substitution there, otherwise your sqlcmd script will always use the default name instead of whatever was passed in to it.

Tags: