FAST Server - WebAnalyzer, total size of files is huge

  • Question

  • Hello All,

    We are running FAST Search for SharePoint 2010 in our environment. On the FAST server, the WebAnalyzer is aggressively writing TMP files (logs, I assume), and the disk is running out of space quickly. Together these files total hundreds of GB. They are located on the FAST server under ..\FASTSearch\data\webanalyzer\waworker1\build_views\default.



    Here are my questions:

    1. Can I delete these TMP files safely? Is there a procedure I should be following or can I just delete all files in this directory in the normal fashion?

    2. What is the purpose of these files? Are they required for the proper functioning of FAST? Or are these just logs?

    3. Is it possible to have FAST stop writing these files? I'm not sure whether there is a service or configuration setting that can be turned off, which would in turn stop the creation of these files.



    Please share your experience/knowledge on this topic. Thanks in advance.

    Regards.



    Wednesday, March 14, 2012 10:56 PM

Answers

  • This is surprising. The only explanation I can think of is that WebAnalyzer is not keeping up with constant feeding and view processing: although it deletes the older TMP files, it keeps creating new ones at a higher rate. Otherwise, something may not be right on your system. So, I would:

    1)  Collect all WebAnalyzer-related logs from all nodes where WebAnalyzer or its components (FDMWorker, walinkstorerreceiver, walookupdb) might be running. The locations would be something like this:

    %FASTSEARCH%\var\log\webanalyzer
    %FASTSEARCH%\var\log\fdmworker
    %FASTSEARCH%\var\log\walinkstorerreceiver
    %FASTSEARCH%\var\log\wapartialupdater

    That's quite a lot of logs, and you may want to open a support ticket for further analysis, but you can at least take a quick look at the WebAnalyzer logs, specifically proc.log and webanalyzer.log under the \var\log\webanalyzer directory. Does anything jump out during the period when you saw this growth?
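
    For a first pass over proc.log and webanalyzer.log, something like the sketch below can pull out lines worth a closer look. It is not a FAST tool: the function name is made up for illustration, and the keywords are an assumption, since the exact log format is not described here. Adjust them to whatever the logs actually contain.

```python
# Assumed markers of trouble; the real log format may use different wording.
KEYWORDS = ("error", "warning", "critical")


def find_suspect_lines(log_path, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive. Undecodable bytes are replaced rather than
    raising, because the log encoding is not known in advance.
    """
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_logs.py <path to proc.log> <path to webanalyzer.log>
    for log_file in sys.argv[1:]:
        for number, text in find_suspect_lines(log_file):
            print(f"{log_file}:{number}: {text}")
```

    Comparing the timestamps of the matching lines with the period of disk growth is then a manual step.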

    Alternatively, are you using WebAnalyzer extensively for relevance/ranking calculations, and do you need the most up-to-date analysis? If not, you can pause (or potentially abort) processing of all of your views, which should stop WebAnalyzer from running the sorting operations that generate TMP files during a processing run. That may give WebAnalyzer a chance to catch up and eventually delete all the TMP files, if that's the real problem.

    You would have to run something like this:

    >>waadmin PauseProcessing

    Description:

    PauseProcessing will pause ongoing processing on current view. 

    Or you may even need AbortProcessing, which, unlike PauseProcessing, means that processing of the view would have to be restarted.

    >>waadmin AbortProcessing -n <viewname>


    Igor Veytskin

    Tuesday, March 20, 2012 7:17 PM
    Moderator

All replies

  • Hi,

    First, be aware that disk space is one of WebAnalyzer's biggest requirements. You need to plan for it and make sure enough disk space is allocated during the design stage of the project.

    http://technet.microsoft.com/en-us/library/gg604780.aspx

    gives the following disk space sizing numbers for WA:

    Content source                       Number of items per CPU core    GB disk per million items
    Database                             20 million                      2
    SharePoint Server 2010 / Intranet    10 million                      6
    Public web content                   5 million                       25

    So in the worst-case scenario you are looking at 25 GB of disk space for each million items, although this may fluctuate depending on specifics. You may also need additional disk space for these TMP files, even though they are temporary.
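
    As a rough illustration of that arithmetic, the sketch below multiplies an item count by the GB-per-million figure from the table. The function name and the 10 million example are hypothetical, and the result does not include the extra space the TMP files take up temporarily.

```python
# GB of WebAnalyzer disk per million items, taken from the sizing table above.
GB_PER_MILLION_ITEMS = {
    "database": 2,
    "sharepoint": 6,
    "public_web": 25,
}


def estimate_wa_disk_gb(item_count, content_source):
    """Return the estimated WebAnalyzer disk need in GB for one content source."""
    return item_count / 1_000_000 * GB_PER_MILLION_ITEMS[content_source]


if __name__ == "__main__":
    # Hypothetical example: 10 million items of each content type.
    for source in GB_PER_MILLION_ITEMS:
        print(source, estimate_wa_disk_gb(10_000_000, source), "GB")
```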

    To answer your questions:

    1) The TMP files cannot be deleted manually. No files can be deleted manually from the folders under build_collection and build_views. The only way to get rid of them is to remove the collections (and the WebAnalyzer views, if you have created any).

    2) The TMP files are generated when WA sorts data sets that don't fit into memory. Several sorting operations will be started during a processing run. The files are required by WebAnalyzer/SPRel. They contain data used during the analysis; they are not log data.

    3) No, this is not possible, but FAST will delete these files after a while, when the sort operation has completed. Stopping WA/SPRel will stop the creation of these files, but that will also impact relevance/recall.


    Igor Veytskin

    Thursday, March 15, 2012 2:03 PM
    Moderator
  • Thanks for the response Igor. While I decide on the next steps, I only wanted to add a few more details.



    On my end,

    1. I deleted a couple of content sources from SharePoint that contained close to 1.4 million items.

    2. I then ran the Clear-FASTSearchContentCollection operation on the FAST server, to clean out the collection that the content source was mapped to. Now the collection is empty.



    These actions were performed with the intent of freeing up some disk space, as we were noticing problems due to low disk space. However, to my surprise, the activity in this directory (..\FASTSearch\data\webanalyzer\waworker1\build_views\default) is huge, and it now contains almost 6800 files with a total size of 690 GB. Also, whenever we restart the FAST Search service or perform any action, more such files are created, each 100 MB in size. They are created at a rate of 7 files per minute, and I'm not seeing them get deleted. Right now the directory has 6854 files.
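
    For anyone who wants to track the same numbers, here is a minimal sketch of how the file count, total size, and growth rate can be measured. The function names are made up for illustration, and it assumes the files carry a .tmp extension. At 7 files per minute of 100 MB each, the growth works out to about 42 GB per hour.

```python
from pathlib import Path


def summarize_tmp_files(directory):
    """Return (file_count, total_bytes) for *.tmp files directly in directory.

    Assumes the TMP files use a .tmp extension (matched case-insensitively).
    """
    count = 0
    total = 0
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix.lower() == ".tmp":
            count += 1
            total += path.stat().st_size
    return count, total


def growth_gb_per_hour(files_per_minute, mb_per_file):
    """Convert a file creation rate into GB per hour (1 GB = 1000 MB)."""
    return files_per_minute * mb_per_file * 60 / 1000


if __name__ == "__main__":
    import sys

    # Usage: python tmp_usage.py <directory>; defaults to the current directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    count, total = summarize_tmp_files(directory)
    print(f"{count} TMP files, {total / 1e9:.1f} GB")
    print(f"{growth_gb_per_hour(7, 100):.0f} GB/hour at 7 files/min of 100 MB each")
```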



    With all of the above said, what would you recommend as a next step to address this issue? Since you mention that FAST will delete these files after a while, I wanted to know how soon and/or how often FAST deletes them. I look forward to your assistance. Thanks.



    Note - I've also posted another thread here, http://social.technet.microsoft.com/Forums/en-US/fastinternetesp/thread/b85dabbf-9a07-4852-8fc9-61a82eacca9c


    Regards.


    Thursday, March 15, 2012 10:02 PM