locked
How to generate a report of all docs in sharepoint with metadata? RRS feed

  • Question

  • Hello,

     

    In my SharePoint farm(MOSS 2007) all the documents uploaded are content type tagged. All content types has fields ABC, PQR and XYZ but users sometimes doesnt enter values in this fields while they upload documents. Now i have to generate a report showing how many documents are there in farm, how many of them have value in the fields ABC, PQR and XYZ. Which site this document belongs to, URL of the document library etc. Note that we have around 24 content databases and 12000+ site collections. Can someone help me in how to proceed with this requirement? is there a way to query content database directly to get these information?
    Thanks in advance

    Mathew

    aju
    • Edited by Mike Walsh FIN Thursday, March 3, 2011 9:08 AM Need changed to How+? Do not ask for quick help in Texts - these are forums, post and wait.
    Wednesday, March 2, 2011 12:38 PM

Answers

  • Hello again,

    You don't have to loop and check metadata for every single document!

     SPSiteDataQuery wil help you determine which docs in aSsite Collection of Content Type XX have specific metadata filled or not.

    I also think (have not tested it) that by using recursive attribute in your CAML query will reduce the number of the Site Collections that your query has to be executed against.

     

    Another possible solution would be to focus on a  Content Query Web Part...

    But in this case you should  Log On as a user with read access Everywhere in your intranet.

     

    How many web apps does your intranet contain ??


    Dimitris Porikos, MCTS
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Wednesday, March 2, 2011 3:44 PM
  • Hello,

    In the example below I get all documents (BaseType = 1) from our dev site collection having a specific content type and a specifig value in a field.

    SPSiteDataQuery q = new SPSiteDataQuery();
    q.Lists = "<Lists BaseType ='1'/>";
    q.Query = "<Where><And><Eq><FieldRef Name='ContentType' /><Value Type='Text'>ContentTypeName</Value>"
            + "</Eq><Eq><FieldRef Name='TextFieldName' /><Value Type='Text'>CustomValue</Value></Eq></And></Where>";
    
    q.Webs = "<Webs Scope='SiteCollection' />";
    q.ViewFields = "<FieldRef Name='Title' /><FieldRef Name='ID' />";
    DataTable dt = null;
    using (SPSite s = new SPSite("http://dev.portal.com"))
          {
            SPWeb w = s.OpenWeb();
            dt = w.GetSiteData(q);
            w = null;
          }
    
    dataGridView1.DataSource = dt.DefaultView;
    MessageBox.Show("Finished Query.." + dt.Rows.Count.ToString());
    

     

    I hope that helps.

    D.


    Dimitris Porikos, MCTS
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Tuesday, March 8, 2011 9:24 AM
  • hello

    SPSiteDataQuery is most suitable tool for that. I would like to add one more note which will be useful if you will use SPSiteDataQuery: ViewFields property affects of how many items will be returned. By default if you specify some field in the ViewFields, but this field is empty or is not in the list, item will not be returned at all. In order to return items regardless of the ViewField, you will need to specify Nullable="TRUE" for <FieldRef> inside view fields.


    Blog - http://sadomovalex.blogspot.com
    CAML via C# - http://camlex.codeplex.com
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Tuesday, March 8, 2011 12:23 PM

All replies

  • Mathew,

    I think what you are trying to do would need some custom code. You can write an application where you can use the SPSiteDataQuery Class: http://msdn.microsoft.com/en-us/library/ms409088(v=office.12).aspx

    By using this class you can create CAML queries against a site collection and view the results. So, I think that the idea would be to loop through every Web Application and every Site Collection (maybe only the Top Leve Site Collections (??) - since there is property I think which can be used to search recursively)  programmatically, and store this information to a DB, XML or any other output format.

    In order to construct your CAML query you can make use of a free CAML Builder tool by U2U: http://www.u2u.net/res/Tools/CamlQueryBuilder.aspx


    Dimitris Porikos, MCTS
    Wednesday, March 2, 2011 2:24 PM
  • Hello Dimitris,

     

    Thank you for your reply. I was also thinking of using SPSiteDataQuery but we have around 13000 site collections and i have to loop through every single document in each site collection and get the meta data field of each document which will take days to complete and will affect the performance badly. And each time the management need a report this has to be repeated. :(

     

    Any other ideas?

     

    Thanks,

    Mathew


    aju
    Wednesday, March 2, 2011 3:29 PM
  • Hello again,

    You don't have to loop and check metadata for every single document!

     SPSiteDataQuery wil help you determine which docs in aSsite Collection of Content Type XX have specific metadata filled or not.

    I also think (have not tested it) that by using recursive attribute in your CAML query will reduce the number of the Site Collections that your query has to be executed against.

     

    Another possible solution would be to focus on a  Content Query Web Part...

    But in this case you should  Log On as a user with read access Everywhere in your intranet.

     

    How many web apps does your intranet contain ??


    Dimitris Porikos, MCTS
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Wednesday, March 2, 2011 3:44 PM
  • hi Dimitris,

     

    i have only one webapp, do you have an example how "SPSiteDataQuery wil help you determine which docs in aSsite Collection of Content Type XX have specific metadata filled or not." ?


    aju
    Wednesday, March 2, 2011 4:32 PM
  • Hello,

    In the example below I get all documents (BaseType = 1) from our dev site collection having a specific content type and a specifig value in a field.

    SPSiteDataQuery q = new SPSiteDataQuery();
    q.Lists = "<Lists BaseType ='1'/>";
    q.Query = "<Where><And><Eq><FieldRef Name='ContentType' /><Value Type='Text'>ContentTypeName</Value>"
            + "</Eq><Eq><FieldRef Name='TextFieldName' /><Value Type='Text'>CustomValue</Value></Eq></And></Where>";
    
    q.Webs = "<Webs Scope='SiteCollection' />";
    q.ViewFields = "<FieldRef Name='Title' /><FieldRef Name='ID' />";
    DataTable dt = null;
    using (SPSite s = new SPSite("http://dev.portal.com"))
          {
            SPWeb w = s.OpenWeb();
            dt = w.GetSiteData(q);
            w = null;
          }
    
    dataGridView1.DataSource = dt.DefaultView;
    MessageBox.Show("Finished Query.." + dt.Rows.Count.ToString());
    

     

    I hope that helps.

    D.


    Dimitris Porikos, MCTS
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Tuesday, March 8, 2011 9:24 AM
  • hello

    SPSiteDataQuery is most suitable tool for that. I would like to add one more note which will be useful if you will use SPSiteDataQuery: ViewFields property affects of how many items will be returned. By default if you specify some field in the ViewFields, but this field is empty or is not in the list, item will not be returned at all. In order to return items regardless of the ViewField, you will need to specify Nullable="TRUE" for <FieldRef> inside view fields.


    Blog - http://sadomovalex.blogspot.com
    CAML via C# - http://camlex.codeplex.com
    • Marked as answer by Peng Lei Thursday, March 10, 2011 7:18 AM
    Tuesday, March 8, 2011 12:23 PM