Some index maintenance scripts or dynamic SQL scripts use SQL object names, e.g. a database name, table name, or index name, as values for variables or columns.
Another technically correct practice is to use the QUOTENAME function to wrap the object name in [ and ] brackets. This handles cases where the name contains special characters or is a reserved keyword.
Example –
set @dbname = quotename(db_name(db_id()))
set @tablename = quotename(object_name(object_id))
set @indexname = quotename(name) -- index name taken from sys.indexes
The Issue
The brackets will add two more characters to the value.
So for really long object names that are 128 characters in length, the value would be 128 + 2 = 130 characters.
The two extra characters can break variable assignments or row inserts with a truncation error –
String or binary data would be truncated. [SQLSTATE 22001] (Error 8152).
In my shop, we do have some long index names and I have had to debug some scripts for this issue. And this is not the first time!
The Solution
I suggest that you use 130 as the width for the variables or columns that will store a quoted object name. SQL Server object names can be at most 128 characters long (the sysname type is nvarchar(128)), so a width of 130 accommodates the two extra bracket characters.
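A minimal sketch of the idea (the variable and table names below are just for illustration):

-- Variables sized to hold a quoted object name: 128 characters plus the 2 brackets
declare @dbname    nvarchar(130) = quotename(db_name())
declare @tablename nvarchar(130)
declare @indexname nvarchar(130)

-- The same width applies to columns that store quoted object names
declare @worklist table
(
    dbname    nvarchar(130),
    tablename nvarchar(130),
    indexname nvarchar(130)
)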
Some things are meant to be learnt the hard way. And that is how I learnt about today’s gotcha.
Overview of [sysdiagrams]
I have been working on an ERD (Entity Relationship Diagram) recently and used the Database Diagram feature in SSMS for this purpose. When you try to create a diagram for the first time in a database, a message box asks you if you would like to create diagramming objects.
Image 1
On clicking Yes, a system table by the name of [sysdiagrams] is created in the same database that you are creating the diagram in. The diagrams are stored in this table. SSMS shows the diagram in the Database Diagrams node of the Object Explorer tree view.
Image 2
The Scenario
I was creating my diagram in a test environment. At some point I had to refresh all data from the production environment. The easiest way for me to do a full refresh is to use the Import and Export Wizard. The wizard can be launched either via the SSMS context menu or in Business Intelligence Development Studio as an SSIS project. As usual for quick data refreshes, I selected all tables using the top-left Source checkbox in the wizard, and chose the options to delete and reinsert all rows with identity values.
Image 3
When the wizard ran successfully, my database diagram at the destination test system was missing!
Gotcha
Upon some research and trials I found that if the diagramming capabilities are enabled in both the source and destination servers, the wizard automatically includes the [sysdiagrams] table in the list of tables to refresh. As you can see, there are no other system tables in the wizard except the [sysdiagrams] table, so it is easy to miss in a long list.
Image 4
So in my case, all data in the destination [sysdiagrams] table was deleted. There were no diagrams at the source so nothing was imported for this table. This outcome would have been the same with the drop and recreate option too because the destination table would have been recreated.
Conclusion
One needs to be careful while using the Import and Export Wizard. Uncheck this table in the selection list to preserve the diagrams at the destination.
Event logging in SSIS gives a lot of valuable information about the run-time behavior and execution status of the SSIS package. Having a common minimum set of events is good for consistency in reports and general analysis. Let's say your team wants to ensure that all packages log at least the OnError, OnWarning, OnPreExecute and OnPostExecute events. If the package has a Data Flow Task, then the BufferSizeTuning event should also be logged. The developer can include more events as required, but those mentioned previously are the minimum that must be included.
You can create pre-deployment checklists or documentation to ensure this minimum logging. Probably you already have that. As the number of developers and/or packages increases, it becomes difficult to ensure consistency in anything, not just event logging. Therefore documentation, checklists and training are helpful only to an extent. Your requirement could be more complex than my five-event example above and thus more prone to oversight.
The easiest way to ensure a common logging implementation is a logging template that has all the minimum events pre-selected. The developer then just needs to apply it to the package.
Event Logging in SSIS with a Template
I assume that you are already familiar with the concepts of event logging in SSIS, so this post is not going to be a beginner-level introduction to event logging. I will rather discuss options to have a minimum standard of event logging across SSIS packages and teams with minimal effort. I'll also mention some traps to avoid.
I am using a demo SSIS package with two Data Flow Tasks and an Execute SQL Task. I have enabled event logging in SSIS for the first few events at the package level for the sake of demonstration. The logging configuration options for the package node (which is the top node) are shown in the first image.
Image 1 – Event logging configuration window for the package node
The logging options at the child container node Data Flow Task 1 are shown in the second image. The configuration for the other Data Flow Task and the Execute SQL Task looks the same.
Image 2 – Event logging configuration window at the child container node
The check marks for the tasks are grayed out which means they are inheriting the logging options from their parent, i.e. the package. To disable logging for a task, remove its check mark in the left tree view window.
TIP: Logging can also be disabled by going to the Control Flow canvas and changing the LoggingMode property of the task to Disabled.
The Trick
Now look at the bottom of the images again. Notice the Load… and Save… buttons? They do exactly what they say. You can set your logging options and save them as an XML template. Later, this XML template can be loaded into other packages to enable the same logging options.
The XML template file has nodes for each event. For example, the logging options for the OnError event are saved like this –
<EventsFilter Name="OnError">
  <Filter>
    <Computer>true</Computer>
    <Operator>true</Operator>
    <SourceName>true</SourceName>
    <SourceID>true</SourceID>
    <ExecutionID>true</ExecutionID>
    <MessageText>true</MessageText>
    <DataBytes>true</DataBytes>
  </Filter>
</EventsFilter>
Notice that the XML just mentions the event name, not the name of any task. This means that when the template file is loaded, this logging option will be set for any task where the event is applicable. More on this later.
The Traps
The OnError event is a generic event applicable to all tasks. Let's talk about events that are specific to particular tasks. For example, the BufferSizeTuning event is applicable only to Data Flow Tasks, not Execute SQL Tasks.
When I proceed to set logging for the BufferSizeTuning event, I have to set it individually at the Data Flow Task tree node. Notice the message at the bottom of the second image that says –
To enable unique logging options for this container, enable logging for it in the tree view.
This message is important in the context of saving and loading a template file too. When I save a template file, the logging options of just that tree view node are saved. For example, the BufferSizeTuning event will be saved in the template only if I am at the Data Flow task in the tree view. It will not be saved if I am at the Package or the Execute SQL task in the tree view.
The reverse is also true. When I load a template, its logging options are applied to just that node which I select in the tree view. For example, if I load a template at the Data Flow Task 1, the options will not be applied to the Data Flow Task 2 or the Execute SQL Task. If the template has an event that is not applicable to the task then that event’s settings will be ignored. For example, the BufferSizeTuning event logging option is meant for Data Flow Tasks so it will be ignored for the Execute SQL Task. The fact that non-relevant options are ignored can be helpful for us to consolidate all logging options in a single template file.
Conclusion
A package-level Save and Load of a logging template is straightforward. But if you need logging for events that are specific to a task type, then consider creating a logging template for each type of task. Also, if your logging configuration requires anything beyond the package-level settings, remember to load the template for each task in the tree view.
Option 1: Individual file per task
How: Create one template file for each type of task. The file has the events applicable to that task.
Pros: Easier to know which task types have a template and which do not.
Cons: More files to manage.

Option 2: Single file for all tasks
How: Create a template file for each task, then copy all event options into a single XML file.
Pros: One file is easier to manage.
Cons: Not obvious which tasks are included; comments need to be added in the XML file.
Do you trust your users to always pass range search parameters in the correct order? Or do the users trust that the code will take care of a small thing like parameter ordering?
If you have a stored procedure or a script that accepts two parameters to do a range search using the BETWEEN keyword (or maybe with >= and <=), then it is important to verify that the start expression and end expression are in the correct order. The correct order for numeric values is the smaller value before the larger value. The correct order for character values is dictionary order. A user might pass the expression values the wrong way around and see no results returned.
The following demonstration has a small check for the correct order of the values and does a quick swap to fix any issue. To help things a bit more, instead of using generic variable names like @param1 and @param2, the parameters can be made self-documenting with descriptive names like @begin_param and @end_param.
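The original demonstration script is not reproduced here, so the following is a minimal sketch of the idea; the procedure, table, and column names (GetOrdersInRange, dbo.orders, orderdate) are made up for illustration.

-- Reorder the range parameters before using them in a BETWEEN search
create procedure GetOrdersInRange
    @begin_orderdate date,
    @end_orderdate date
as
begin
    -- If the caller passed the values the wrong way around, swap them
    if @begin_orderdate > @end_orderdate
    begin
        declare @swap date = @begin_orderdate
        set @begin_orderdate = @end_orderdate
        set @end_orderdate = @swap
    end

    select id, customerid, orderdate
    from dbo.orders
    where orderdate between @begin_orderdate and @end_orderdate
end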
For an INNER JOIN, the location of the filter criteria makes no difference. For OUTER joins, on the other hand, yes, it could affect the output.
This happens because when a filter criterion is made part of the JOIN condition, it is applied at the time of reading the tables to identify the rows for the JOIN. The rows that do not satisfy the filter criterion are not included for the purpose of the JOIN, and that causes NULLs in the OUTER JOINs.
However, with the filter criterion as part of the WHERE clause, it is applied after the query has been processed and the ad hoc dataset has been created internally. The final dataset is reduced to just those rows which meet the filter criterion.
I will use a criterion of c.isactiveYN = 'Y' for the demonstration below. Rows with a value of 'Y' are used for the join. The rows with 'N' or NULL don't match the criterion, so they are left out.
As the examples show below, this seemingly minor difference can cause substantial difference in the query output of OUTER JOINs. This effect could cascade on to other subsequent tables in bigger queries that are joined to the original tables.
-- Create demo tables and insert some data
declare @customer table
(id int,
 cname varchar(20),
 isactiveYN char(1))

declare @order table
(id int,
 customerid int,
 orderdate date)

insert into @customer
values
(1, 'Bob', 'Y'),
(2, 'Harry', 'N'),
(3, 'Max', 'Y')

insert into @order
values
(1, 1, '2014-1-1'),
(2, 1, '2014-2-2'),
(3, 2, '2014-3-3'),
(4, 2, '2014-4-4'),
(5, 3, '2014-5-5'),
(6, 3, '2014-6-6')

select * from @customer
select * from @order
Results:
Demonstration: The location of the criteria has no effect on the results of an INNER JOIN.
-- A simple INNER JOIN without any
-- conditions to display all data
select *
from
    @customer as c
    inner join
    @order as o
        on c.id = o.customerid

-- The following two queries differ
-- in the location of the criteria
-- but still show the same results

-- INNER JOIN with filter in WHERE clause
select *
from
    @customer as c
    inner join
    @order as o
        on c.id = o.customerid
where
    c.isactiveYN = 'Y'

-- INNER JOIN with filter in ON clause
select *
from
    @customer as c
    inner join
    @order as o
        on c.id = o.customerid
        and c.isactiveYN = 'Y'
Results: The second and third result sets are the same.
Demonstration: LEFT OUTER joins with the condition in the WHERE clause and the ON clause
-- LEFT JOIN: Filter in WHERE clause
select *
from
    @customer as c
    left outer join
    @order as o
        on c.id = o.customerid
where
    c.isactiveYN = 'Y'

-- LEFT JOIN: Filter in ON clause
select *
from
    @customer as c
    left outer join
    @order as o
        on c.id = o.customerid
        and c.isactiveYN = 'Y'
Results: “Harry” is not in the first result set because the row was filtered out by the WHERE clause. “Harry” is in the second result set because it comes from the preserved (LEFT OUTER) table, but its order columns are NULL because the row was not used for the JOIN.
Demonstration: RIGHT OUTER joins with the condition in the WHERE clause and the ON clause
-- RIGHT JOIN: Filter in WHERE clause
select *
from
    @customer as c
    right outer join
    @order as o
        on c.id = o.customerid
where
    c.isactiveYN = 'Y'

-- RIGHT JOIN: Filter in ON clause
select *
from
    @customer as c
    right outer join
    @order as o
        on c.id = o.customerid
        and c.isactiveYN = 'Y'
Results: “Harry” is not in the first result set because the row was filtered out by the WHERE clause. “Harry” is not in the second result set either, because the customer row was filtered out of the JOIN and the customer table is not the preserved (OUTER) side; his orders still appear, but with NULLs in the customer columns.
Summary: When the criteria is in the WHERE clause, the results of the INNER, LEFT OUTER and RIGHT OUTER queries are the same. Differences show up in the results of the LEFT and RIGHT OUTER queries when the criteria is in the ON clause.
If you want to assign some default value to a parameter of type VARCHAR in a stored procedure and the parameter value does not have any spaces then the quotes around the value are optional.
create procedure TestProc
(
    @a varchar(20) = 'spaces in string',  -- must use quotes for a string with spaces
    @b varchar(20) = 'nospacesinstring',
    @c varchar(20) = nospacesinstring     -- quotes are optional if there are no spaces in the string
)
as
begin
    select @a as a, @b as b, @c as c  -- minimal body so the procedure compiles
end
The ISNUMERIC function (MSDN documentation) is used to check whether a string is a valid number. In the demonstration that follows, you would expect the results for Ids 1 to 5, but look closely at the results for Ids 6, 7 and 8. There are multiple commas in those strings, and the last one is not even a number!
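The original demonstration table is not reproduced here; the following is a minimal sketch along the same lines, with illustrative values for Ids 1 to 8.

declare @strings table (id int, val varchar(10))

insert into @strings
values
(1, '1234'),      -- plain integer: returns 1
(2, '12.50'),     -- decimal: returns 1
(3, '-99'),       -- signed number: returns 1
(4, 'abc'),       -- not a number: returns 0
(5, ''),          -- empty string: returns 0
(6, '1,2,3'),     -- multiple commas: still returns 1
(7, '12,34,5'),   -- multiple commas: still returns 1
(8, '$')          -- currency symbol alone: returns 1

select id, val, isnumeric(val) as is_valid_number
from @strings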
Well, a note at the MSDN documentation explains the True result for the ‘$’ sign at Id 8 –
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($). For a complete list of currency symbols, see Using Monetary Data.
For the strings at Ids 6 and 7 with commas, the function returns True because the strings are still convertible to a number (a money conversion ignores the commas). Something to remember while writing those data validation routines.
Some minor omissions can go unnoticed and produce completely unexpected results.
The following code example has two seemingly similar SELECT statements, but they produce different results. All because I missed a comma in the second SELECT statement! SQL Server did not raise any error and quietly assumed that I was trying to use an alias name for the First Name column!
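The original script is not shown here; a minimal sketch of the same mistake (the table and column names are illustrative) could look like this:

declare @person table (FirstName varchar(20), LastName varchar(20))
insert into @person values ('Bob', 'Smith')

-- Intended query: returns two columns, FirstName and LastName
select FirstName, LastName from @person

-- Missing comma: LastName is silently treated as a column alias for FirstName,
-- so only one column (named LastName, containing 'Bob') is returned
select FirstName LastName from @person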
If you try to put an empty string into an INTEGER type column (or one of its cousins), SQL Server implicitly converts it into a zero. This might not be the expected behavior for the application, so it is good to be aware of it.
declare @v char(1) = ''  -- this could have been a varchar too with the same results

declare @t table
(
    colbigint bigint,
    colint int,
    colsmallint smallint,
    coltinyint tinyint
)

/* The wrong way that will change all blank strings to zero */