Monday, November 1, 2010

SQL unions and the text data type

Transact-SQL supports a data type named text. It's deprecated and may not be supported forever. It's similar to nvarchar(max).

I ran into an odd problem related to the text data type. I tried to do a query with a union, something like this:

CREATE TABLE [dbo].[t1](
[c1] [int] NULL,
[c2] [text] NULL
)

CREATE TABLE [dbo].[t2](
[c1] [int] NULL,
[c2] [text] NULL
)

SELECT * FROM t1 UNION SELECT * FROM t2


The SELECT statement fails with this error message: The text data type cannot be selected as DISTINCT because it is not comparable.

A bit odd, because there's no reference to DISTINCT in my query, but SQL Server 2005 must use DISTINCT behind the scenese to perform the UNION.

A workaround is to cast text to nvarchar(max), as in SELECT c1, CAST(c2 AS NVARCHAR(MAX)) AS c2 FROM t1 UNION SELECT c1, CAST(c2 AS NVARCHAR(MAX)) AS c2 FROM t2.

Two drawbacks to this workaround: First, you can't use SELECT *, since you must explicitly name any columns that need to be cast to nvarchar(max). Second, if your text columns happen to hold more data than the capacity of nvarchar(max), I'm not sure what would happen. Either an error message or data loss, I assume.

The real fix would be to change any text columns in your database to nvarchar(max) or some other data type that isn't deprecated and works with unions. That could take a fair amount of work if you have many stored procedures, etc. that refer to the existing text columns.

2 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Right on the money!

    Question is, do you need UNION (with the removal of potential double entries), or would UNION ALL be good enough?

    ReplyDelete