none
How to use OR undervarious columns

    Question

  • Hi Gurus,

    In my sql i have to compare 15 columns and if they have value ='Y' then its Y else N. I wanted to use the CASE Statement but is there easy way instead of using OR 15 times?

    Thanks

    Thursday, February 7, 2019 10:00 PM

Answers

  • Another way:

    create table diagnose (id int identity(1,1),ad1 varchar(1),ad2 varchar(1),ad3 varchar(1),ad4 varchar(1)
    ,ad5 varchar(1),ad6 varchar(1),ad7 varchar(1),ad8 varchar(1) )
    GO
    
     insert diagnose (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8) VALUES
    ('y','y','y','n','y','y','y','n'),
    ('y','y','y','y','y','y','y','n'),
    ('n','n','n','y','y','y','y','n'),
    ('y','y','y','y','y','y','y','n')
     
     
     Select Case when Len(concat (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8))-Len(Replace( concat (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8) ,'y',''))>0 then 'y' else 'n' end as col16 
     from diagnose
     
     
     
     drop table diagnose

    Friday, February 8, 2019 8:26 PM
    Moderator
  • Hi LordofPlains,

    Here you go:

    create table #diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
     insert #diagnose (ad1,ad2,ad3,ad4) VALUES
    ('y','y','y','y'),
    ('n','y','y','n'),
    ('y','y','y','y'),
    ('n','n','n','y'),
    ('n','n','n','n')

    select *, col_16=
    case 
    when coalesce(nullif(ad1,'n'),nullif(ad2,'n'),nullif(ad3,'n'),nullif(ad4,'n')) = 'y' then 'y'
    else 'n'
    end
    from #diagnose
    drop table #diagnose

    Saturday, February 9, 2019 3:28 AM
  • Thanks Pedro and pituach

    create table diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
    GO
    insert diagnose (ad1,ad2,ad3,ad4) VALUES
    (y,y,y,y),
    (n,y,y,n),
    (y,y,y,y),
    (n,n,n,y)
    
    GO

    i have15 columns like this , my 16th column is a user created one using the case statement 

    case when ad1='y' or 'ad2='y' or ad3='y' or ad4='y' then 'y' else 'n' end

    Thanks

    Hi <?!? would be nice to have a name ?!?>

    This is exactly what I guessed that you are doing, and what I wrote about. It is clear that you should stop everything > take a step back > and think about your database architecture and tables structure! It might be a very good idea at this point (according to this sample of what you do) to take an external consultant to follow your work and designing the system.

    This small code include so many mistakes and misunderstanding about SQL Server that I afraid to think about your entire system performance.

    Let's mentioned two important points, and I really hope for your company sake that you read this:

    1. Using varchar make no sense in most cases

    When you use the type varchar without clarify specific length SQL Server consider it as the default which is length 1. This is the same as using VARCHAR(1)

    If the MAX length is 1 then why the hell do you use VAR?!?

    "For single-byte encoding character sets such as Latin, the storage size is n bytes + 2 bytes"

    When we use VAR, SQL Server must store the length of the data in addition to the data itself, which cost another 2 bytes behind the scenes. This means that in order to store "y" you actually uses 3 bytes, And even in order to store empty string "" you will use 2 bytes
    Using CHAR(1) you will only use 1 byte! Yes... CHAR(1) always uses less size than VARCHAR(1).
    What is the the sense in using VARCHAR(1)?!?
    In fact VARCHAR(n) can be useful in theory only for n > 2

     

    2. Using true/false values

    You want to store "Y"/"N" value which is exactly what I spoke about above! You basically need true/false solution, which in SQL Server is done using the data type BIT. You can store 8 columns of BIT in the same size that you store a single CHAR(1) column! For all 11 columns of BIT SQL Server will use 2 bytes (SQL Server collects each group of 8 BIT into one byte, which mean that for a single BIT column we uses the same size as CHAR(1) but for 8 columns this become VERY useful and can improve performance dramatically!)

    In short, something looks VERY bad in your design!

    Back to the question and solution for this DDL (as poor as it is, this was the question)

    This is the DDL+DML for the sample

    create table diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
    GO
    insert diagnose (ad1,ad2,ad3,ad4) VALUES
    ('y','y','y','y'),
    ('n','y','y','n'),
    ('y','y','y','y'),
    ('n','n','n','y')
    GO

    and here are several solutions with totally different approaches
    AND I CAN FIND MORE THAN 100 DIFFERENT SOLUTIONS IF YOU WANT :-)

    SELECT ad1,ad2,ad3,ad4,
    	-- Option 1: Use only single condition! since condition are expensive
    	CASE WHEN ad1 = 'y' and ad2 = 'y' and ad3 = 'y' and ad4 = 'y' then 'OK' ELSE 'NO' END,
    	-- Option 2: using TRY_CONVERT and REPLACE and simple math
    	-- If the value is 4 then all values where 'y'
    	-- if we have NULL then the SUM of NULL and INT is NULL so the solution is NULL
    	TRY_CONVERT(INT, REPLACE(ad1,'y','1')) + TRY_CONVERT(INT, REPLACE(ad2,'y','1')) + TRY_CONVERT(INT, REPLACE(ad3,'y','1')) + TRY_CONVERT(INT, REPLACE(ad4,'y','1')),
    	-- Option 3: (Great solution) using CONCAT
    	-- if the final value is 'yyyy' then this is OK
    	CONCAT(ad1,ad2,ad3,ad4),
    	CASE WHEN CONCAT(ad1,ad2,ad3,ad4)='yyyy' then 'OK'else'NO' END,
    	-- Option 4: Check the length of the string after removing all 'n'
    	-- if the length is the same as the number of columns then all columns where 'y'
    	-- In this sample if the length is 4 then all columns are 'y' 
    	LEN(REPLACE(CONCAT(ad1,ad2,ad3,ad4), 'n','')),
    	-- Option 4: using BITWISE operator and CONVERT
    	TRY_CONVERT (BIT, REPLACE(ad1,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad2,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad3,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad4,'y',1))
    	-- AND I CAN FIND MORE THAN 100 DIFFERENT SOLUTIONS IF YOU WANT :-)
    FROM diagnose

    Have fun, and I HIGHLY recommend to read what I wrote at the start


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]


    Saturday, February 9, 2019 7:32 AM
    Moderator
  • First of all, read a book on SQL so that you will not confuse the CASE expression with a statement. Next, your specs are ambiguous.; Did you mean they are all equal to "Y" or that one or more of them is equal to "Y"?   

    Finally, you really should not use this kind of encoding in SQL. It looks like you're setting assembly language bit flags. Then there's a linguistic problem as to how you answer yes/no question. Latin, for example, has no yes/no; English responds to the positive form of the question, no matter how it's asked; other languages answer to the exact form in which the question was asked. Also, I will try to find that article on the linguistic problems and post a link for you.

    And there's nothing wrong with a long list of ORs.  

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Saturday, February 9, 2019 8:00 PM

All replies

  • Hi Lordo,

    Please can you put an example of your query? Not need 15 columns, with 2 or 3 is good enough. Also if you could include an example of your test tables and data, that would be great.

    Thanks,

    Thursday, February 7, 2019 10:07 PM
  • Hi Gurus,

    In my sql i have to compare 15 columns and if they have value ='Y' then its Y else N. I wanted to use the CASE Statement but is there easy way instead of using OR 15 times?

    Thanks

    Good day,

    If you gave us more technical information about your database structure (like queries to create the table) instead of stories(description) then we could understand the scenario much better and probably provide a solution without the need to guess. Please provide the missing information :-)

    In the meantime, If I am guessing correctly you have columns of type string which can get the value "y" or "n" which mean that your database is VERY POORLY designed, and you should re-think about the database architecture.

    If your columns where simple BIT type then the size was much much smaller,the performance was higher, and even the solution for this question was simple math...

    You can use bitwise operator to get the answer directly

    drop table if exists Tbl
    GO
    create table Tbl (c1 bit,c2 bit,c3 bit,c4 bit,c5 bit,c6 bit,c7 bit,c8 bit,c9 bit)
    GO
    
    -- instead of "y" and "n" we use 1 for "y" and 0 for "n"
     insert Tbl (c1,c2,c3,c4,c5,c6,c7,c8,c9) VALUES
    (1,1,1,1,1,1,1,1,1),
    (1,0,1,1,1,1,1,1,1),
    (1,1,1,1,1,0,1,1,1),
    (1,1,1,1,1,1,1,1,1),
    (1,1,1,1,1,1,1,1,1),
    (1,1,1,0,1,1,1,1,1),
    (1,1,1,1,1,1,1,1,1),
    (1,1,1,1,0,1,1,1,1)
    GO
    
    SELECT * FROM Tbl
    GO
    
    --Solution 
    SELECT c1&c2&c3&c4&c5&c6&c7&c8&c9 -- this will be 1 ("y") only if all columns are 1
    from Tbl


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]



    Thursday, February 7, 2019 10:17 PM
    Moderator
  • Thanks Pedro and pituach

    create table diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
    GO
    
     insert diagnose (ad1,ad2,ad3,ad4) VALUES
    (y,y,y,y),
    (n,y,y,n),
    (y,y,y,y),
    (n,n,n,y)
    
    GO

    i have15 columns like this , my 16th column is a user created one using the case statement 

    case when ad1='y' or 'ad2='y' or ad3='y' or ad4='y' then 'y' else 'n' end

    Thanks

    Friday, February 8, 2019 2:54 PM
  • Another way:

    create table diagnose (id int identity(1,1),ad1 varchar(1),ad2 varchar(1),ad3 varchar(1),ad4 varchar(1)
    ,ad5 varchar(1),ad6 varchar(1),ad7 varchar(1),ad8 varchar(1) )
    GO
    
     insert diagnose (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8) VALUES
    ('y','y','y','n','y','y','y','n'),
    ('y','y','y','y','y','y','y','n'),
    ('n','n','n','y','y','y','y','n'),
    ('y','y','y','y','y','y','y','n')
     
     
     Select Case when Len(concat (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8))-Len(Replace( concat (ad1,ad2,ad3,ad4,ad5,ad6,ad7,ad8) ,'y',''))>0 then 'y' else 'n' end as col16 
     from diagnose
     
     
     
     drop table diagnose

    Friday, February 8, 2019 8:26 PM
    Moderator
  • Hi LordofPlains,

    Here you go:

    create table #diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
     insert #diagnose (ad1,ad2,ad3,ad4) VALUES
    ('y','y','y','y'),
    ('n','y','y','n'),
    ('y','y','y','y'),
    ('n','n','n','y'),
    ('n','n','n','n')

    select *, col_16=
    case 
    when coalesce(nullif(ad1,'n'),nullif(ad2,'n'),nullif(ad3,'n'),nullif(ad4,'n')) = 'y' then 'y'
    else 'n'
    end
    from #diagnose
    drop table #diagnose

    Saturday, February 9, 2019 3:28 AM
  • Thanks Pedro and pituach

    create table diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
    GO
    insert diagnose (ad1,ad2,ad3,ad4) VALUES
    (y,y,y,y),
    (n,y,y,n),
    (y,y,y,y),
    (n,n,n,y)
    
    GO

    i have15 columns like this , my 16th column is a user created one using the case statement 

    case when ad1='y' or 'ad2='y' or ad3='y' or ad4='y' then 'y' else 'n' end

    Thanks

    Hi <?!? would be nice to have a name ?!?>

    This is exactly what I guessed that you are doing, and what I wrote about. It is clear that you should stop everything > take a step back > and think about your database architecture and tables structure! It might be a very good idea at this point (according to this sample of what you do) to take an external consultant to follow your work and designing the system.

    This small code include so many mistakes and misunderstanding about SQL Server that I afraid to think about your entire system performance.

    Let's mentioned two important points, and I really hope for your company sake that you read this:

    1. Using varchar make no sense in most cases

    When you use the type varchar without clarify specific length SQL Server consider it as the default which is length 1. This is the same as using VARCHAR(1)

    If the MAX length is 1 then why the hell do you use VAR?!?

    "For single-byte encoding character sets such as Latin, the storage size is n bytes + 2 bytes"

    When we use VAR, SQL Server must store the length of the data in addition to the data itself, which cost another 2 bytes behind the scenes. This means that in order to store "y" you actually uses 3 bytes, And even in order to store empty string "" you will use 2 bytes
    Using CHAR(1) you will only use 1 byte! Yes... CHAR(1) always uses less size than VARCHAR(1).
    What is the the sense in using VARCHAR(1)?!?
    In fact VARCHAR(n) can be useful in theory only for n > 2

     

    2. Using true/false values

    You want to store "Y"/"N" value which is exactly what I spoke about above! You basically need true/false solution, which in SQL Server is done using the data type BIT. You can store 8 columns of BIT in the same size that you store a single CHAR(1) column! For all 11 columns of BIT SQL Server will use 2 bytes (SQL Server collects each group of 8 BIT into one byte, which mean that for a single BIT column we uses the same size as CHAR(1) but for 8 columns this become VERY useful and can improve performance dramatically!)

    In short, something looks VERY bad in your design!

    Back to the question and solution for this DDL (as poor as it is, this was the question)

    This is the DDL+DML for the sample

    create table diagnose (ad1 varchar,ad2 varchar,ad3 varchar,ad4 varchar)
    GO
    insert diagnose (ad1,ad2,ad3,ad4) VALUES
    ('y','y','y','y'),
    ('n','y','y','n'),
    ('y','y','y','y'),
    ('n','n','n','y')
    GO

    and here are several solutions with totally different approaches
    AND I CAN FIND MORE THAN 100 DIFFERENT SOLUTIONS IF YOU WANT :-)

    SELECT ad1,ad2,ad3,ad4,
    	-- Option 1: Use only single condition! since condition are expensive
    	CASE WHEN ad1 = 'y' and ad2 = 'y' and ad3 = 'y' and ad4 = 'y' then 'OK' ELSE 'NO' END,
    	-- Option 2: using TRY_CONVERT and REPLACE and simple math
    	-- If the value is 4 then all values where 'y'
    	-- if we have NULL then the SUM of NULL and INT is NULL so the solution is NULL
    	TRY_CONVERT(INT, REPLACE(ad1,'y','1')) + TRY_CONVERT(INT, REPLACE(ad2,'y','1')) + TRY_CONVERT(INT, REPLACE(ad3,'y','1')) + TRY_CONVERT(INT, REPLACE(ad4,'y','1')),
    	-- Option 3: (Great solution) using CONCAT
    	-- if the final value is 'yyyy' then this is OK
    	CONCAT(ad1,ad2,ad3,ad4),
    	CASE WHEN CONCAT(ad1,ad2,ad3,ad4)='yyyy' then 'OK'else'NO' END,
    	-- Option 4: Check the length of the string after removing all 'n'
    	-- if the length is the same as the number of columns then all columns where 'y'
    	-- In this sample if the length is 4 then all columns are 'y' 
    	LEN(REPLACE(CONCAT(ad1,ad2,ad3,ad4), 'n','')),
    	-- Option 4: using BITWISE operator and CONVERT
    	TRY_CONVERT (BIT, REPLACE(ad1,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad2,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad3,'y',1)) & TRY_CONVERT (BIT, REPLACE(ad4,'y',1))
    	-- AND I CAN FIND MORE THAN 100 DIFFERENT SOLUTIONS IF YOU WANT :-)
    FROM diagnose

    Have fun, and I HIGHLY recommend to read what I wrote at the start


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]


    Saturday, February 9, 2019 7:32 AM
    Moderator
  • First of all, read a book on SQL so that you will not confuse the CASE expression with a statement. Next, your specs are ambiguous.; Did you mean they are all equal to "Y" or that one or more of them is equal to "Y"?   

    Finally, you really should not use this kind of encoding in SQL. It looks like you're setting assembly language bit flags. Then there's a linguistic problem as to how you answer yes/no question. Latin, for example, has no yes/no; English responds to the positive form of the question, no matter how it's asked; other languages answer to the exact form in which the question was asked. Also, I will try to find that article on the linguistic problems and post a link for you.

    And there's nothing wrong with a long list of ORs.  

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Saturday, February 9, 2019 8:00 PM
  • Thank You CELKO , pictuah and everybody for the time. My beginner days on SQL , i was just ignorant thinking multiple OR's or AND's might cause some issues to other users who are querying the database, is that true?

    Monday, February 11, 2019 6:31 PM