MOSS Content Database Growing Very Large
-
Monday, June 09, 2008 5:30 PM
I have received the following error (over a period of days):
Event Type: Error
Event Source: Office Server Search
Event Category: Gatherer
Event ID: 10027
Date: 6/9/2008
Time: 10:11:32 AM
User: N/A
Computer: MOSSDV1
Description:
Failed to update committed transaction in SQL, DocID is 26211786. Context: Application 'SharedServices1', Catalog 'ProfileImport'
Details:
One or more errors occurred during processing of command. (0x80040e14) For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
I'm not sure what caused the error; however, after a machine reboot, I no longer get these errors in the event log.
If I try to "Reset all crawled content" or just "Stop" incremental crawls, it looks like these functions just time out or hang.
Our SharedServices1 search database is now very large, about 10 times the size it should be.
Any ideas on how to reduce the size of the database?
All Replies
-
Wednesday, June 11, 2008 6:28 AM
You can back up and shrink the transaction log file by using the following commands in SQL Server Management Studio.
BACKUP LOG <DatabaseName> TO DISK = '<BackupFile>'
DBCC SHRINKFILE (<FileName>, <TargetSize>) WITH NO_INFOMSGS
You can also view these two KB articles for more information:
How to use the DBCC SHRINKFILE statement to shrink the transaction log file in SQL Server 2005
How to stop the transaction log of a SQL Server database from growing unexpectedly
-lambert
-
Wednesday, June 11, 2008 3:08 PM
It is not the transaction log file that is getting large; it is the following tables in the search database:
MSSCrawlURL (77 GB)
MSSCrawlQueue (5.9 GB)
These are about 10 times larger than they are in a production copy that we have (which contains a copy of the production data - about the same amount of data).
It looks like it just keeps crawling and adding rows to these tables. Any ideas how to stop this? Resetting crawled content seems to fail after a long period of time. I've tried to exclude all the URLs that we are crawling. We only have about 5,000 items to crawl.
-
Thursday, June 12, 2008 8:30 AM
You can try rescheduling the crawl. You can change the schedule under “Shared Services Administration > Search Settings > Content sources and crawl schedules”. If the status on that page is always “crawling full”, you may need to review this article and try to resolve it.
-lambert
-
Thursday, June 12, 2008 5:45 PM
I see "Incremental Crawl" seems to be the status. I checked the indexes per the article you referred to; however, mine appear to be set up correctly.
Is there a way just to create a new SSP and move to it (say SharedServices2)? Can I then make this my default and delete SharedServices1? I did this once in the past but it failed and I had to rebuild MOSS 2007 from scratch. Just trying to get some hints from others that have done this.
I can see that you can go to Central Administration > Application Management > Manage this farm's Shared Services:
Create a New SSP
Change Default SSP
Will this create a new SSP, new search indexes, etc. and possibly resolve this problem?
-
Friday, June 13, 2008 5:48 AM
Most of the time the status should be idle, so check whether the incremental crawl is scheduled too frequently.
Also try “DBCC SHRINKFILE (<FileName>, <TargetSize>) WITH NO_INFOMSGS” in SQL Server. This command can shrink not only the log file but also the database file.
Yes, you can create a new SSP and change the association of the current web application to the new SSP. Whether to change the default SSP depends on your needs. I think this approach is the last resort.
-lambert
-
Friday, June 13, 2008 6:25 PM
Thanks for the "DBCC SHRINKFILE" suggestion; however, it works at the file level, not at the table level, which is what matters in this case. Even if we ran it on the file that contains the tables in question, there is no free space to remove, and removing free space is all this command does.
We have already applied hotfix KB946517. This is a 64 bit installation. We also have applied SP1 for WSS and MOSS 2007.
I have seen similar things in other blogs:
1. http://blogs.msdn.com/gregmcb/archive/2008/03/03/index-grows-out-of-control.aspx
Index Grows Out of Control
I have seen two occasions where the index entries in the MSSCrawlURL table grew from 39K to 13 million entries. While I have not figured out the root cause, the workaround is to just rebuild your index.
How do we rebuild the index?
2. http://admincompanion.mindsharp.com/BillBlog/Lists/Posts/Post.aspx?ID=83
Cleaning Out the Crawl Queue
There are three ways that the crawl queue can be cleared out. The first way – and the most often used method – is the successful completion of a crawl effort on a content source. The start addresses in the content source are said to have been fully crawled when all of the URLs in the crawl queue have been processed and crawled. The second method is to stop the crawl process. When you do this, the content source enters a state of "stopping" and remains in this state until all of the URLs have been emptied from the queue. If the queue has a large number of URLs in it (100,000 or more), it may take a number of minutes or hours (depending on the SQL Server's available resources) to completely empty the queue. Thirdly, resetting the index will clear out the queue. In my testing, I have seen orphaned URLs remain in the MSSCrawlerQueue and the only way to remove them is to reset the index.
Again, how do we rebuild the index?
stsadm -o osearch -action stop
Warning: The Office SharePoint Server Search service that was enabled on this server will be uninstalled and all index files for all SSPs will be deleted. Are you sure you want to permanently delete all data in the index for all SSPs?
stsadm -o osearch -action start -role indexQuery
This will stop the Microsoft Office search service and restart it.
The warning leads me to believe that stopping this would uninstall all index files for the SSP. I'm not sure I want to do this. Where can I learn more about what these commands are actually doing?
1. Go to the SQL Server database SharedServices1_Search_DB and delete the records from all the tables with a crawlID column (e.g. the CrawlHistory table).
2. This will stop the crawl.
3. Create a new Shared Service with a new database using SharePoint Central Admin.
4. Set the new Shared Service as the default.
5. Delete the old Shared Service.
6. Set up the crawl rules and run the full crawl.
This looks like the steps I would follow if I were going to create a new SSP. Any thoughts on this process?
-
Friday, June 13, 2008 6:29 PM
A little more information that might help me resolve this issue:
Here are row counts on the first 15 characters of the MSSCrawlURL.accessurl values. As you can see the stuff in spsimport is what’s out of whack.
spsimport://7e4 56900092
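For anyone who wants to reproduce this kind of breakdown on exported URL values, here is a minimal sketch in Python. It assumes the access URLs have already been pulled into a plain list; the function name `prefix_counts` and the sample values are hypothetical and not taken from the real table.

```python
from collections import Counter

def prefix_counts(urls, width=15):
    """Count URLs by their first `width` characters."""
    return Counter(url[:width] for url in urls)

# Hypothetical sample values, used only to illustrate the grouping.
sample = [
    "spsimport://7e4aaaa/one",
    "spsimport://7e4bbbb/two",
    "http://portal/sites/a",
]

counts = prefix_counts(sample)
for prefix, n in counts.most_common():
    print(prefix, n)
```

The prefix with the largest count points at the content source that is flooding the table.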
When I look at Shared Services Administration: SharedServices1 > User Profile and Properties
I see:
Membership & BDC import status: Importing (1,523,572)
Somehow our ADS import got messed up (last successful import was)
Import time: Started full import at 4/25/2008 11:01 PM
I see 28,000,000 successful imports (we have about 600 users).
Many of the successful results look like garbage: spsimport://?$$dl$$/xlcstlwf/...tlwf/xtradv/xlcstlwf...
It basically times out when I try to stop this import. Any ideas how to stop this import?
Any ideas how to clean up the indexes (maybe stsadm commands)?
-
Monday, June 16, 2008 9:14 AM Moderator
To rebuild the index, look at this article. I have no idea why so many user profiles were imported.
If I find something that may help, I will let you know.
-lambert
-
Monday, June 16, 2008 9:18 PM
I have looked into the content of some of the tables. It looks like I have a lot of garbage in there from an import of AD users that apparently did not work. The garbage is in a format like: spsimport://domain?$$dl$$. I've seen another post like this: http://dotnetmafia.com/blogs/dotnettipoftheday/archive/2008/04/21/profile-import-error-the-specified-domain-either-does-not-exist-or-could-not-be-contacted-exception-from-hresult-0x8007054b.aspx
What I really want to do is delete the entire queue and search index and have it start over. Basically, I want to reset the crawl index. I have tried this via the browser interface, but it just times out or never completes the task. I don't see the number of records in the tables go down either. My queue hovers around 15-24 million records, and the URL table has 60 million records for a small number of documents.
I looked into some command line options like:
stsadm -o osearch -action stop
however, when I do this command, it gives me the following warning message:
"Warning: The Office SharePoint Server Search service that was enabled on this server will be uninstalled and all index files will be deleted. Are you sure you want to permanently delete all data in the index? [y/n]"
Is this the correct command to stop the search queue? I don't really want to uninstall anything. Do I have to reinstall something to start searching again? I do want to delete my search indexes and then have the search service start from scratch.
I thought that I would then do the following after the search service was stopped:
stsadm -o osearch -action start
I'm not really sure if the stop clears the tables or not. I'm using the following as my reference: http://technet.microsoft.com/en-us/library/cc262920(TechNet.10).aspx
Any ideas how to make the bad AD content source stop trying to index? I've deleted it at this point trying to make it stop.
-
Thursday, June 19, 2008 4:46 PM
My mistake, looks like the Dev instance is 20000 times bigger with the same basic content.
Dev
MSSCrawlURL: 60616166 (83.97 GB)
MSSCrawlURLLog: 7058214 (7.98 GB)
MSSCrawlQueue: 21458972 (4.63 GB)
Prod
MSSCrawlURL: 4775 (4.3 MB)
MSSCrawlURLLog: 38 (24 KB)
MSSCrawlQueue: 0 (248 KB)
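As a quick check of the "20000 times" figure, the MSSCrawlURL numbers above can be compared directly. This is only a sketch of the arithmetic; it assumes the first number on each line is a row count and that GB and MB are binary units.

```python
# Figures copied from the Dev and Prod MSSCrawlURL lines above.
dev_rows, prod_rows = 60_616_166, 4_775
dev_bytes = 83.97 * 1024**3   # 83.97 GB, assuming binary units
prod_bytes = 4.3 * 1024**2    # 4.3 MB, assuming binary units

row_ratio = dev_rows / prod_rows
size_ratio = dev_bytes / prod_bytes

print(f"rows: {row_ratio:,.0f}x, size: {size_ratio:,.0f}x")
```

Under those assumptions, Dev holds roughly 12,700 times as many rows as Prod and takes up about 20,000 times the space.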
Any feedback on this idea from above?
stsadm -o osearch -action stop
stsadm -o osearch -action start
How about the process to create a new SSP idea?
-
Friday, June 20, 2008 7:02 AM Moderator
There must have been incorrect operations during daily work with your Dev instance. I suggest the following steps:
1. Create a new SSP.
2. Delete and recreate the search database.
3. Perform a full crawl and user profile import.
-lambert
-
Friday, June 20, 2008 2:31 PM
I created a new SSP (SharedServices2) in a new application pool. I made sure all of the associations were on the new application pool. I changed the default application pool to be SharedServices2.
I started a crawl with SSP2 and ran out of disk space on the database box. I tried to delete the SSP1 to get back some disk space but it failed with a generic WSS3.0 error message. I added disk space.
Now when I go to Central Admin and look at my SSPs, SharedServices1 is marked as (unprovisioning).
I found a link that covers this case:
https://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=1920981&SiteID=17
I tried the following command recommended by the post:
stsadm -o deletessp -title SharedServices1 -force
I get an error dialog that says stsadm.exe is not a valid Win32 application.
In the command window, after "stsadm -o deletessp -title SharedServices1 -force"
I see
"Access is denied."
Any ideas on what access is needed or who I need to run this command as? I am running this as a site collection admin.
-
Monday, June 23, 2008 1:37 AM Moderator
You need local administrator rights. Try adding your account to the built-in Administrators group.
-lambert
-
Monday, June 23, 2008 4:06 PM
The account I was using was an admin account.
After reviewing the issue more, stsadm was a 0 byte file size. I'm not sure how it got like this. I copied stsadm.exe from another installation and tried it again. I got the error "command line error". I'm not sure what I had wrong as I was using the command line in this post.
I decided to try using the GUI again. The site was still marked as "unprovisioning". This time (with more disk space added), it deleted SSP1.
It looks like I had an SSP that was corrupted somehow. Now I can monitor disk space and see if the new SSP has the same problem. After 3 days, however, it has not grown appreciably larger than production.
The basic steps to resolve the problem were:
1. Create a new SSP in a new application pool.
2. Associate all applicable web sites with the new SSP.
3. Change the default SSP to the new SSP.
4. Delete the old SSP.
-
Monday, July 14, 2008 3:15 PM
Could it be that your system is completely out of HD space?
- Delete the databases from within SQL Server Management Studio. Search databases are expendable; you don't need a backup.
- Delete all other unnecessary data (temp files etc.) that you might have on the server's HD.
- Check all other databases: check their size and % effective size. Shrink the log files of all databases to free up the space.
Once you have recreated the SSP and everything works again:
Make sure your search DB size is limited to less than the space available on disk. Never let it run out of disk space. If a log file needs to grow beyond its configured size, some actions can't complete, but it will recover. If the disk runs out of space, it's unpredictable what will happen.
If the two search databases differ in size, it might be that search rules block content on prod but not on dev, or vice versa. For instance, use a read-only search account on dev as well; you surely need a read-only search account on prod.
-
Tuesday, September 15, 2009 2:19 PM
Hi
We did as you specified, but the problem came back after about a month.
The database keeps adding a line for every child domain we have...
spsimport://e5e5c853-9788-4b3d-ba5e-523a1b9e6bd8/?$$dl$$/am.domain.se/no.domain.se/am.domain.se/no.domain.se/no.domain.se/fi.domain.se/fi.domain.se/fi.domain.se/no.domain.se/fi.domain.se........
and it keeps going... I stopped the "Membership & BDC import" at about 1,100,000 lines.
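One way to spot entries like this in an exported list of URLs is to look for domain names that repeat after the `$$dl$$/` marker. The sketch below is in Python and is only illustrative: the function name `repeated_segments` is hypothetical, the sample URL is a shortened made-up value in the same shape as the entry above, and treating repetition as the sign of a runaway entry is an assumption based on what this thread shows.

```python
from collections import Counter

def repeated_segments(url, marker="$$dl$$/"):
    """Return the path segments that occur more than once after the marker."""
    _, found, tail = url.partition(marker)
    if not found:
        return {}
    segments = [s for s in tail.split("/") if s]
    return {seg: n for seg, n in Counter(segments).items() if n > 1}

# Shortened, hypothetical URL in the same shape as the entry above.
url = ("spsimport://example/?$$dl$$/am.domain.se/no.domain.se/"
       "am.domain.se/no.domain.se/fi.domain.se")

print(repeated_segments(url))
```

A URL without the marker, or one where each domain appears once, yields an empty result.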
Did your problem return?
regards
Robban
-
Friday, June 18, 2010 9:22 AM
We now have this situation on prod, on a SharePoint 2007 based intranet platform. It shows thousands of records under the people_dl_import category in the format spsimport://?$$dl$$/domain1/domain2/domain3/. The import also would not stop; it added millions of records to the database, and the disk was on the verge of filling up.
On other servers, such as dev, we have very little data in this category, and the format is spsimport://domainname?$$dl$$?..., which is good. Dev has only 6,000 rows, while prod has 2 million crawled under the people_dl_import category. I need to know the cause of this garbage data and how to fix it. I tried resetting the content source, and I will do a full import this weekend to see whether the garbage data gets cleared. Any idea what causes this issue?

