none
Indexing in a Merge Replication environment RRS feed

  • Question

  • Hello SQL Gurus

    I am new to Database design.  We are building an application that is expected to have mid-level volumes - Around 10K  to 100K records per day based on the day of the week.  It is a retail application and will have multiple databases with a separate database for each branch and one for the central or Head Office.

    We will be using SQL Merge replication to merge data between the branches and the head office.

    I am thinking of having a Primary Key that uses two Int columns - one of which will be the branch ID and another will be a simple sequence.  I am also planning to have this as the clustered index.  Will this design work or will I need to use a GUID as a primary key with the above key as a separate clustered index considering that we will be using SQL Merge Replication. What would be the best primary index for the tables?

    We will be on SQL Server 2016 SP2.

    Any guidance on this will be highly appreciated.

    Thank you all in advance.

    Thursday, July 9, 2020 7:43 AM

All replies

  • Hi Sudarshan PS,

    When choosing Primary Index in Tables consider these:

    1. The column has to be frequently used in the join process.
    2. The column should be that which is frequently used in the where clause during a row selection.
    3. The column should not be frequently changed.

    Hope this could help you.

    Best regards,
    Cathy 

    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to  MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Friday, July 10, 2020 7:50 AM
  • Hi Cathy,

    Thanks for your reply.  Yes, the column I am choosing as the primary index will be the one that will satisfy all the conditions you mentioned. My dilema is on whether this should be a GUID column or a composite index of two INT columns.  This is because data will be entered in multiple source databases and consolidated using MERGE replication.

    Regards

    Sudarshan.

    Friday, July 10, 2020 1:25 PM
  • hi

    GUID is best choice

    Thanks and Regards

    Friday, July 10, 2020 3:20 PM
  • Hi Laxmidhar,

    Any reasons to support this choice.  I have read that GUID being used as a Primary key and Clustered Index can cause severe page fragmentation and have a big impact on performance. Also, there is the question of a GUID taking 4 times as much space as an INT.  In large and high growth databases, this can have an impact on performance and storage space. 

    Having said that, if there are compelling arguments to support GUID as a choice, then I am open to considering it.

    Warm regards

    Sudarshan.

    Saturday, July 11, 2020 1:07 PM
  • Totaly unique between two tables
    Sunday, July 12, 2020 11:03 AM
  • Any reasons to support this choice.  I have read that GUID being used as a Primary key and Clustered Index can cause severe page fragmentation and have a big impact on performance. Also, there is the question of a GUID taking 4 times as much space as an INT.  In large and high growth databases, this can have an impact on performance and storage space. 

    As I recall, Merge Replication requires all tables to have a ROWGUIDCOL and which must be marked as unique, so you cannot escape the GUID.

    But that does not mean that the GUID has to be the PK. I have never used Merge Replication myself, so I cannot talk much about best practice in that perspective.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Sunday, July 12, 2020 11:39 AM
  • Hi Erland Sommarskog,

    Thanks for your input.  I have also read that MERGE replication creates a GUID column if we don't have an indentified GUID column in our tables.  

    My question was to get some pointers on if it is a good idea to have GUIDs as Primary Keys.  I have read that having GUID as clustered indexes are very bad for performance due to inherent issues with clustering on a character column and also due to a very high degree of fragmentation when we use GUID as a CI. 

    So, was wondering if it would make sense to have a GUID as a PK and a two column combined INT as a CI.

    Warm regards

    Sudarshan. 

    Monday, July 13, 2020 7:29 AM
  • So, was wondering if it would make sense to have a GUID as a PK and a two column combined INT as a CI.

    That's very difficult to answer without knowing the details of your business.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Monday, July 13, 2020 8:38 AM