none
Database to store PDF docs and became very large DB un manageable RRS feed

  • Question

  • Hi

    We have database 2016 sp2 EE. Our business  some customer data available PDF format. Therefore we need a plan to store PDFs either with in DB or outside the DB with reference number in DB level.

    Consideration :

    PDF size per doc:2MB, number of PDF per day 4000, per day size then:8GB ,yearly 8*365=2.9T , 2 years DB = 5.8T, 3 years = 8.7 T

    This database enabled always on two nodes 

    Database will grow massive in two years times

    option1> storing Those PDF in DB level var-binary(max) with year column , with table partitions, data retrieve may may faster than opt3  

    option2> store  Those PDF in DB  with filestrem, data retrieve may may faster than opt3

    option3>. store outside and given reference in DB level(less growth)

     

    Q1 If we select option 1&2 the DB will grow rapidly. due to large size there might have data lag between replicas. Due to large size data refresh to non prod env from prod will take unexpected time line. So what would be the solution for this

    Q2. if select option1 &2 how to manage the  very large DB  with number or data files , partition options, 

    Q3:when select option 1 &2, if we plan to have refresh non prod from prod from every day, then expected   time to complete is high. so how do we manage it.

    Much appreciate expert advise on this. 

    Thanks

     

     

     
    • Edited by ashwan Sunday, October 20, 2019 9:40 PM
    Sunday, October 20, 2019 1:58 AM

All replies

  • Hi ashwan,

     

    I suggest you use filestream. It stores files in a shared folder, and both primary and secondary replica are accessible, which does not cause excessive data file growth. For more details, please refer to https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/filestream-and-filetable-with-always-on-availability-groups-sql-server?view=sql-server-ver15

     

    Best regards,

    Dedmon Dai


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Monday, October 21, 2019 8:14 AM
  • Hi Dedmon,

    Thank you for the reply as its complicated question . 

    Q1, Is that DB size will included with filestream when restore the DB to other node every day? is that mean daily refresh to non prod from prod, does this filestream  datafiles must included or not. Are there any way to reduce refresh time. (Expecting to resfresh mutilple copies per day)

    ex PROD   -> PRODTEST 

                        PRODDEV

                        PRODTEST

    Q2: When calculating DB size, should we need to include datafiles of filstream? or not  

    regards


    • Edited by ashwan Tuesday, October 22, 2019 3:07 AM
    Tuesday, October 22, 2019 2:53 AM
  • >>Q1, Is that DB size will included with filestream when restore the DB to other node every day? is that mean daily refresh to non prod from prod, does this filestream  datafiles must included or not. Are there any way to reduce refresh time. (Expecting to resfresh mutilple copies per day)

     

    Filestream just stores the unstructured data in the file system, so the size of the database also contains these unstructured data. When you restore the filestream enabled database, you will find that you also need to restore the filestream file.

     

    In always on, filestream is placed on a shared disk. When you enable FILESTREAM on an instance of SQL Server, an instance-level share is created to provide access to FILESTREAM data. After a failover, FILESTREAM data is accessible on both the readable secondary replica and the new primary replica.So the synchronization time won't be long because the data synchronization between the replicas does not contain these unstructured data.

     

    >>Q2: When calculating DB size, should we need to include datafiles of filstream? or not 

     

    Yes, you need to. As I said above, these unstructured data also belong to the database.


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Tuesday, October 22, 2019 3:12 PM
  • Hi Dedmon Great thanks Will you able to provide note reference steps  to mount shared disk to alwayson servers  if possible. But end of the days this does not help(time taken to restore) to restore backup to non prod env(refersh) as remain unpredictable large size DB then.

    regards

       


    • Edited by ashwan Wednesday, October 23, 2019 10:21 PM
    Tuesday, October 22, 2019 10:36 PM