Insufficient system resources exist to complete the requested service

  • Question


  • [I did intend to start this post with a screenshot of the above error as it appears when I initiate the transfer from Windows Explorer, but apparently 'Body text cannot contain images or links until we are able to verify your account', so I will just have to do some typing.] The error dialog says:

     'An unexpected error is keeping you from copying the file. If you continue to receive this error, you can use the error code to search for help with this problem.

    Error 0x800705AA: Insufficient system resources exist to complete the requested service.'

    I get this error pretty much 100% of the time from one particular PC when trying to copy a folder of 10 2GB files to a server with both mirror and parity storage spaces.

    I recently purchased a Thecus W5000 running Windows Storage Server 2012 R2 Essentials. Absent any guidance either way, I decided to set up a storage pool across the three 3TB WD Red drives that I have installed in it and to allocate 1.5TB of that space to a mirror storage space and the remainder to a parity storage space. Having read some fairly dire things about storage spaces, but wanting the resilience provided by those two types of storage space, I decided to run some benchmarking tests before finalising anything.

    To that end I only went as far through the Essentials setup as creating a handful of user accounts before setting up the storage spaces and sharing both of them, with all authenticated users permitted full control. My benchmarking consists of a Take Command batch file timing three large directory copies - one with 10 2GB files, one with 10240 10KB files and another with a multi-level directory containing a variety of files of differing sizes. The first two are completely artificial and the third is a real world example, but all are roughly 20GB in total size.
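
    By way of illustration, a single timed pass is roughly equivalent to the following PowerShell (a sketch only - the paths and share name are placeholders, and the real tests use a Take Command .btm file rather than PowerShell):

        # Time one directory copy from the local disk to a share on the server.
        $source = 'D:\test\out\Big_20GB'            # placeholder path
        $target = '\\thecus\parity\Big_20GB'        # placeholder share
        $elapsed = Measure-Command { Copy-Item -Path $source -Destination $target -Recurse -Force }
        '{0} -> {1}: {2:n1} seconds' -f $source, $target, $elapsed.TotalSeconds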

    To test various aspects of this I copied the three structures to and then from a partition created on the internal disk (the W5000 has a 500GB SSHD) and to the two storage space partitions. I also created a version of the batch file for use internally which did something similar between the internal disk and the two storage space partitions, and another as a control that tested the same process between the two Windows PCs. The internal test ran to successful completion, as did the PC to PC copy and the external one from my Windows 8.1 64-bit system (i5 3570K, 16GB RAM, 1TB HD) but when I ran it from my Windows 7 Pro 64-bit gaming rig (i7 2600K, 8GB RAM, 1TB HD) I got a number of failures with this error from Take Command:

    TCC: (Sys) C:\Program Files\bat\thecus_test_pass.btm [31] Insufficient system resources exist to complete the requested service.

    (where line 31 of that batch file is a copy command from local D: to the parity space on the Thecus).

    The error occurs only when copying large files (the 2GB ones already mentioned, but also some of those in the real world structure that are about 750MB in size) from the Win7 system to the Thecus, and only when doing so to the storage space volumes - i.e. copying to the internal disk works fine, copying from all volumes works fine, copying internally within the Thecus works fine, copying between the Win8 and Win7 machines works fine and initiating the copy as a pull from the server between the same two disks also works fine. One aspect of this that surprised me somewhat was just how quickly the copy fails when initiated from Windows Explorer - checking the details section of the copy dialog I see roughly ten seconds of setting up, and then within five seconds of the first file transfer being shown as starting the error dialog pops up (as per the image no longer at the top of this post).

    There are no entries in the event log on either machine related to this error and I've had the System Information window of the Sysinternals Process Explorer up and running on both machines whilst testing this, and it shows nothing surprising on either side. I've also run with an xperf base active and I can't see anything pertinent in the output from either system.

    Frankly, I am at a loss and have no idea what other troubleshooting steps I should try. The vast majority of the existing advice for this error message seems to relate to Windows 2003 and memory pools - which the fact that this works from one PC but not the other, together with the SysInfo/xperf output, suggests is not the issue here. The other thing I've seen mentioned is IRPStackSize, but again if that was the problem I would expect the failure to occur wherever I initiated the large file transfer from.



    • Edited by CessPitts Friday, January 23, 2015 9:11 AM Added note about event logs
    Friday, January 23, 2015 8:48 AM

All replies

  • Whilst I ponder... Do you have a 3rd party AV on the gaming rig?

    Grey

    Friday, January 23, 2015 2:38 PM
  • Do you have a 3rd party AV on the gaming rig?

    Grey,

    I have AVG IS 2014 on both PCs but nothing on the Thecus.

    BTW where do I go to set a signature?? It may be an idiot user moment but my profile does not seem to contain a signature field.

    Cheers, Steve


    • Edited by CessPitts Friday, January 23, 2015 5:24 PM
    Friday, January 23, 2015 5:24 PM
  • You can also post pics to onedrive etc and post a link.  Plain old resource monitor on both machines showing ram and network? 

    I dunno, in the past low RAM just meant slow, but things still worked.  Your gaming beast is just overpowering the 2GB RAM in the server?  Cripple the NIC in the beast to 100Mbps just to see?

    I guess I can assign 2GB RAM to a VM and see what happens.  If it's not too much of a burden can you uninstall AVG to test?  I know, I know.  You'd be surprised at how many times they tell us uninstalling an AV (not AVG), not just disabling it, "fixes" stuff

    Have to do this to you :)


    Grey

    Friday, January 23, 2015 5:40 PM
  • You can also post pics to onedrive etc and post a link

    Now I didn't try that because the error message suggested that I need to be vetted for links too. I have a number of screenshots from SysInfo if they would help, otherwise I will try and get something tonight (although I was intending to trash the system and re-install everything - not just because of this issue - and I've also dumped the mirror and parity spaces to test with a simple space, which FWIW does not exhibit the issue).


    Your gaming beast is just overpowering the 2GB RAM in the server?

    That was one of the options I considered but if that was the case a) I would expect it to take a little longer to do so, not five seconds, b) I would expect it to do so for all volumes, not just for mirror and parity, c) I would expect it to happen from the Windows 8 box too, which my benchmarking suggests is actually pushing the other end slightly harder than the gaming rig (the latter is nearly four years old whilst the former just passed its second birthday and other than not having a video card is just as, if not more, capable) and d) I would expect Process Explorer or xperf to detect it.


    can you uninstall AVG to test?
    {sigh} I guess so. I have tried disabling it and it made no difference. I'll see if I can get up before anyone else in the household tomorrow and disconnect the 'Super Hub' (router) from the outside world and give it a shot.


    Thanks for the link to Settings - I'd missed that entry on the drop down :( Unfortunately my signature isn't showing, at least not to me, so...



    • Edited by CessPitts Friday, January 23, 2015 6:11 PM Added note about lack of sig. display
    Friday, January 23, 2015 6:08 PM
  • Appreciate all your effort(s) here.  None of us have any of these boxes yet to play with.  Just not sure how it will shake out.  Would like to see your errors if you get that going

    Grey

    Friday, January 23, 2015 6:34 PM
  • Plain old resource monitor on both machines showing ram and network?

    Hopefully I'll be allowed to post links, so here is the storage space definition (since I had to redo it):

    [I tried the image insert, but got the same error message about needing to be verified, so just the text of the links instead]

    http://www.pitts-is.me.uk/images/thecus_create_parity_volume_confirmation.png

    NOTE that subsequent investigations suggest that this issue may not occur with NTFS, only with ReFS. I will investigate further once I've rebuilt the OS.

    Here is the RESMON output from the Thecus box, taken within a second or two of the failure message appearing on the Windows 7 box (Cryo):

    http://www.pitts-is.me.uk/images/thecus_20150123_push_from_cryo_to_parity_space_resmon_overview_after_fail.png

    and here is the RESMON from the Cryo, taken a few seconds later (a bit of fumbling and the pressing of Alt-Pause instead of Alt-PrtScn in between):

    http://www.pitts-is.me.uk/images/cryo_20150123_push_from_cryo_to_parity_space_resmon_overview_after_fail.png

    HTH



    • Edited by CessPitts Friday, January 23, 2015 10:49 PM Added note about NTFS not failing
    Friday, January 23, 2015 10:16 PM
  • LOL, I was just getting ready to ask if you could try with NTFS instead of ReFS.  I set up the SKU in a VM with 2GB RAM and created 4 disks and made a pool like yours with parity and ReFS.  But it seems to be working.  CPU and disk are maxed out.

    I do not know what drive subsystem the Thecus has.  Not sure what to tell you


    Grey

    Sunday, January 25, 2015 2:14 AM
  • try with NTFS instead of ReFS

    It is my intention to set up six storage space volumes next time, using both file systems and all three resiliency types, and then repeat the benchmarking exercise against all of them – although current estimates put that at four to six hours elapsed for each external test pass. 


    it seems to be working.  CPU and disk are maxed out

    The CPU seems to get hit hard for the bigger files – the 10 x 2GB file copies see the CPU running close to 100% for the duration, even on a simple non-storage space volume, whilst for the 10240 x 10KB files it chugs along at about 50-60% – but I see no evidence of Windows 2012 not being able to cope with that (as you commented earlier, it will be slow but that is to be expected). I'm not surprised that it worked for you – after all, it works for me from one source machine too. Once I've managed to recreate the error with the rebuilt system I will expand my testing to include three other systems (a laptop running Win 7 Pro, my son's gaming rig running Win 7 Home and an old desktop running Windows 8), although all are talking via Powerline adapters, which may alter the parameters sufficiently for the problem not to occur.

    Unfortunately I've not yet had a chance to test without AVG on the Windows 7 box because the rebuild did not go entirely smoothly (I need to have words with Thecus about making it impossible to boot the system from the front panel USB connector – the lack of useful documentation on this box is disgraceful) and I've decided to try some benchmarking with non-storage space volumes before recreating the storage pool approach.


    I do not know what drive subsystem the Thecus has

    Nor do I, thanks to the grotty documentation. A quick dig around in Device Manager suggests that it might be an Intel ICH10 or ICH10R (if that is the level you were looking for). Can I do anything specific to cast light on that??


    Not sure what to tell you

    Well thanks for trying ;) The thing that frustrates me most about this is that I cannot find any way of divining which end of the connection is generating the error. Thus far I've concentrated my efforts on the Thecus, since it is the newbie in the estate, but the problem may well lie with the Windows 7 box.

    I've also been a little disappointed with a couple of aspects of this. First, Essentials sets up a .local domain by default, which seems to be contrary to current best practice and is why I went for the rebuild: so that I could cancel the initial configuration and restart it using the correct PowerShell incantation to get a proper domain name. Second, all of my testing to date suggests that a parity storage space on this box performs three times worse for writes than a simple storage space. Is that normal, or is it a function of the rock bottom processor and RAM? (Or would you rather I created separate threads for those questions, since they don't relate to the original issue, although the latter might be connected?)



    • Edited by CessPitts Sunday, January 25, 2015 1:25 PM
    Sunday, January 25, 2015 1:21 PM
  • dot local is not a deal breaker is it :)

    If you have not seen this you might like it

    https://software.intel.com/en-us/articles/intel-nas-performance-toolkit

    I am not smart enough to dig too deep so yes Intel ICH10 was what I was pondering. I do think Parity takes some overhead, but I think we agree that should mean slow, not an error.

    Again, I do not have a Thecus, but rebuild should be along the lines of hold down shift in windows and press power>restart.  A menu should come up advanced>troubleshoot> restore the factory image.

    If you have a usb nic lying around wonder if the beast would do different?


    Grey

    Sunday, January 25, 2015 1:52 PM
  • dot local is not a deal breaker is it :)

    Yes and no. I understand what you're saying, but 37 years of working with IBM mainframes has taught me that it is always better to take the time and do it right than to have to try and tidy up later. The reasons for not using .local seem compelling enough to me (cf. http://www.mdmarra.com/2012/11/why-you-shouldnt-use-local-in-your.html) to get it right first time.

    I have played with Intel's NASPT in the past, when first testing my existing D-Link DNS323 NAS, but went back to using my home-grown method as I felt it made it easier to collate results from multiple sources. I might give it another try.


    rebuild should be along the lines of hold down shift in windows and press power>restart. 

    I have to admit that that is a new one on me, but since this was one of the few areas that is documented by Thecus (see page 19 of the Installation and Recovery guide here: http://ftp.thecus.com/wss/) I followed that guide to the letter.


    If you have a usb nic lying around wonder if the beast would do different?

    Interesting idea, but no dice as I don't have such a beast amongst my collection of gadgets

    Sunday, January 25, 2015 4:47 PM
  • Wondering about the pagefile on the Thecus and starting the beast in safe mode with networking

    Grey

    Sunday, January 25, 2015 4:50 PM
  • On http://wss.thecus.com/sp_download.php  under w5000 utilities.  What is this?

    [Release Note]
    1. Improve system utilization.
    2. We could check the LCMDISPLAY version from "Control Panel\Programs\Programs and Features".
    [New Features/Changes]
    1. Improve system utilization.


    Grey

    Sunday, January 25, 2015 5:23 PM
  • Wondering about the pagefile on the Thecus

    Currently system managed and according to Process Explorer virtual memory usage is not an issue


    starting the beast in safe mode with networking

    Another thing to try when I've reproduced the issue


    What is this?

    The W5000 has an LCD display on the front of the unit and this software is used to drive the information displayed on that screen. I'd installed that update on my first build but had forgotten all about it on the rebuild. Having said that, the LCMDisplay had used just five seconds of CPU time in just under 24 hours of uptime, and even the related routine, which from its name controls the LED lights on the unit, had only used a few minutes of CPU in that time - nothing compared to the three WMI Provider hosts, which have used an hour and a half of CPU between them. I've applied the update regardless and we'll see if it makes any difference.

    More pertinent to me is the amount of CPU usage simply copying files to the NAS box eats up. Both NTFS and ReFS simple volumes (no storage spaces) peg the CPU at pretty much 100% when copying anything other than the small file workload. This seems excessive and may explain why the mirror and parity spaces don't perform too well, since they've got a lot of work to do on top of that. If I copy between the internal disk and the simple volumes I don't see anything like that CPU usage, just around 15% or so. My reading on the subject suggests that this is because there is encryption/signing going on when sending data to Windows 2012, which may not be the case for a Linux-based NAS and which is probably overkill for this environment.

    Sunday, January 25, 2015 8:52 PM
    It said Improve system utilization, so I don't know if that was overall or just the LCD.

    Well you could go along these lines...

    https://exchangemaster.wordpress.com/2010/05/12/disable-toe-and-rss/

    More better

    http://blogs.technet.com/b/onthewire/archive/2014/01/21/tcp-offloading-chimney-amp-rss-what-is-it-and-should-i-disable-it.aspx


    Grey


    • Edited by Still Grey Sunday, January 25, 2015 9:39 PM
    Sunday, January 25, 2015 9:32 PM
  • No real update thus far, but last night I cleared down the Thecus data volumes, removing the non-storage space disks (although I suspect that under the covers it doesn't make a lot of difference, the system just added the disks to a 'Primordial' storage pool), defined a single storage pool spanning the three 3TB WD Red disks, created three storage spaces - one for each resiliency type - thin provisioned with 3TB each, and added two volumes to each - one NTFS and one ReFS, with no deduplication in the case of the former.
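
    For reference, that setup is roughly equivalent to the following PowerShell (a sketch only - the friendly names are placeholders and this may not be exactly how the pool and spaces were created; the NTFS and ReFS volumes were then created on top of the resulting virtual disks):

        # Pool the three 3TB WD Reds, then create one thin-provisioned space per resiliency type.
        New-StoragePool -FriendlyName 'WDRedPool' `
            -StorageSubSystemFriendlyName 'Storage Spaces*' `
            -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

        foreach ($type in 'Simple','Mirror','Parity') {
            New-VirtualDisk -StoragePoolFriendlyName 'WDRedPool' -FriendlyName "${type}-Space" `
                -ResiliencySettingName $type -ProvisioningType Thin -Size 3TB
        }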

    I had previously turned off SMB signing and encryption on the server via Group Policy (not recommended for a business environment, obviously, but in my case I felt that the CPU cycles were more valuable than the security of the traffic), as per advice I'd read in a couple of different Windows Home Server related forums (and using the description of how to do it here: http://mctexpert.blogspot.co.uk/2011/02/disable-smb-signing.html). This greatly improved the CPU usage on the Thecus and consequently improved disk copy times, both reading and writing, by anything from 15% to 85% (with a couple of strange anomalies, involving writing to the internal SSHD, where they actually went up - albeit I only ran a single test from each box). One other anomaly that I noticed, and which I'll be investigating further, was that CPU when copying from the Windows 7 box was running around 50% whereas from the Windows 8.1 system it was more like 25%, which suggests some significant difference between the two.
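
    For completeness, the Group Policy change amounts to roughly the following SMB server settings (a sketch only - I made the change via Group Policy as described, and it is not something I'd recommend where the security of the traffic matters):

        # Stop requiring/offering SMB signing and turn off SMB encryption on the server,
        # trading security of the traffic for CPU cycles as described above.
        Set-SmbServerConfiguration -RequireSecuritySignature $false `
                                   -EnableSecuritySignature $false `
                                   -EncryptData $false -Force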

    Finally, and the point of this message, I left a full set of copies (write to and read from each of the six volumes, with the same three test directories and data as before) running overnight on the Windows 7 rig and the only ones that failed were writing to the mirror and parity ReFS volumes, with the latter apparently exhibiting the problem to a greater degree since 19 of the real world workload files failed as well as all ten of the 2GB ones.

    Having read all of the advice around tunnelling, chimneys etc. {sigh} I think I may be even more confused than before - surely turning off the offload on the server is going to make CPU usage worse, not better (assuming that the network card is handling the offload properly)?? Either way I'm going to try disabling 'Large Send Offload v2 (IPv4)' on the Realtek NIC in the Windows 7 box (not least because I've seen at least one other person reporting success from doing so with the same chipset) and running some tests. Others to follow now that I can recreate the issue.

    Tuesday, January 27, 2015 7:47 AM
  • Just so I am not even more confused.  When you say parity and mirror, you mean 2 distinct separate pools?  I did not see a way to mirror with parity for one pool.

    Yes, MS thinks offloading is great but they depend on the NIC vendors.  So I suppose it could be on the Win7 side.

    You are worse than me spending time on a problem :)


    Grey

    Tuesday, January 27, 2015 3:26 PM
  • When you say parity and mirror, you mean 2 distinct separate pools?

    Indeed I do, as per the screenshots that (hopefully) follow below. Actually, no they don't because I'm apparently still not trusted with blunt objects, so it is just the URI text and I'm afraid you'll have to copy and paste :(


    You are worse than me spending time on a problem :)

    I need to fix this otherwise this new purchase was a waste of several hundred quid, unless I can convince the vendor to take it back as not fit for purpose.

    Server Volumes view: http://www.pitts-is.me.uk/images/thecus_volumes_view_20150127.png

    Server Storage Pools view: http://www.pitts-is.me.uk/images/thecus_storage_pools_view_20150127.png

    Server Disks view, showing the mirror volumes: http://www.pitts-is.me.uk/images/thecus_disks_view_for_mirror_20150127.png

    Tuesday, January 27, 2015 6:17 PM
  • I am not sold on THIN provisioning if you can't add more drives?  But should not matter of course.

    Has nothing to do with this, but could you make a screenshot of diskmgmt.msc for me?  Just would like to see how they partitioned the OS drive.  Is it 32gb?


    Grey

    Tuesday, January 27, 2015 8:06 PM
  • I am not sold on THIN provisioning

    I'm not intending to use it for real, but whilst creating this test setup it was easier than working out exactly what proportion of 3x3TB was available for me to allocate to each storage space, given that the drive manufacturers are, by convention, allowed to lie about how big their drives are.


    could you make a screenshot of diskmgmt.msc for me?

    Does the Volumes display above not give you the same information??


    Just would like to see how they partitioned the OS drive.  Is it 32gb?

    As delivered, yes there was a 32GB OS partition and nothing else defined. I extended that to 90GB+ and then partitioned the remainder of the SSHD as an NTFS volume.

    To return to the original issue I've tried a variety of things this evening, all without success. I tried disabling the TCP Offload settings on the Windows 7 machine NIC and on the Thecus active NIC (it has two, only one of which is cabled) and both together. No dice. I tried turning off RSS on the Thecus. I tried increasing IRPStackSize to 18 on the Windows 7 system (because devicetree - http://www.osronline.com/article.cfm?article=97 - told me that I had a couple of devices with stack sizes of 16 or 17). I tried booting the Windows 7 system into Safe Mode with Networking. All attempts fail in pretty much exactly the same way - a brief period of Calculating the various metrics and then the 0x800705AA error dialog within five seconds or so of starting the first file copy.
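
    For the record, those tweaks amount to roughly the following (a sketch only - the adapter name is a placeholder, and I actually made the changes through the NIC's Advanced properties and the registry):

        Disable-NetAdapterLso -Name 'Ethernet' -IPv4    # Large Send Offload v2 (IPv4)
        Disable-NetAdapterRss -Name 'Ethernet'          # Receive Side Scaling
        Set-NetOffloadGlobalSetting -Chimney Disabled   # TCP Chimney offload
        # IRPStackSize (bumped to 18 on the Windows 7 box); takes effect after a restart.
        New-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters' `
            -Name IRPStackSize -Value 18 -PropertyType DWord -Force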

    Now I'll be honest and admit that I wasn't expecting any of those troubleshooting steps to work - because I would have expected failures related to them to manifest themselves for all copies not just the two multi-write ReFS ones, but they had to be eliminated.

    However, I am now pretty much out of ideas. I will try stripping AVG IS 2014 off the Windows 7 box first thing tomorrow morning, but if that makes no difference then I really don't know where to go next :(


    Cheers, Steve

    Tuesday, January 27, 2015 10:57 PM
  • Since you have resized the 32 I will not see that, but I was curious if it had 2 baby partitions or one in front of the OS and if it had a recovery partition at all.

    Try a diff nic?  I know nothing about ReFS, though I remember 10 years ago the Home Server team was talking about self healing drives.  I guess this is it.

    http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx

    http://windowsitpro.com/windows-server-2012/q-what-features-does-ntfs-support-refs-does-not-support

    Are you copying the same files from the win 8 as the Win 7.  You think if you had any files on win7 with short file names it would cause this?


    Grey

    Tuesday, January 27, 2015 11:45 PM
  • I was curious if it had 2 baby partitions or one in front of the OS and if it had a recovery partition at all

    Just to satisfy your curiosity, herewith a full disk management view as things stand today: http://www.pitts-is.me.uk/images/thecus_disk_management_all_20150128.png

    According to that view there is a 100MB partition on the front of the disk, but there isn't anything in it.


    Try a diff nic?

    I doubt that I can install a separate card, but since there is a second Ethernet port then I guess it would be easy enough to test the other one. Nothing in the documentation told me which one to use anyway, but I made an educated guess based on the manual for the N5550, which appears to use the same hardware. 


    I know nothing about ReFS

    It is supposed to eliminate bit rot, and thereby remove the need for CHKDSK et al. Outside of this issue all of my testing so far would suggest that there is little difference in direct copy performance between the two file systems, but there must be additional background checking going on for ReFS partitions.
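
    As an aside, the integrity stream side of that can be inspected from PowerShell - a sketch only, with a placeholder path:

        # Check, and if necessary enable, integrity streams on a file stored on a ReFS volume.
        Get-FileIntegrity -FileName 'R:\Big_20GB\big_file.1'
        Set-FileIntegrity -FileName 'R:\Big_20GB\big_file.1' -Enable $true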


    Are you copying the same files from the win 8 as the Win 7

    Physically, no, obviously not. Logically, yes: they all started from the same seed directories (on a PC that is now being used by my older daughter).


    You think if you had any files on win7 with short file names it would cause this?

    No risk of that with the structure I've been using to test from Explorer, viz:

    >dir d:\test\out\Big_20GB\
    
     Volume in drive D is Nemesis_Data   Serial number is 943f:d398
     Directory of  D:\test\out\Big_20GB\*
    
    21/04/2011  14:52         <dir>    .
    21/04/2011  14:52         <dir>    ..
    08/03/2011  22:52   2,146,566,144  big_file.1
    08/03/2011  22:52   2,146,566,144  big_file.2
    08/03/2011  22:52   2,146,566,144  big_file.3
    08/03/2011  22:52   2,146,566,144  big_file.4
    08/03/2011  22:52   2,146,566,144  big_file.5
    08/03/2011  22:52   2,146,566,144  big_file.6
    08/03/2011  22:52   2,146,566,144  big_file.7
    08/03/2011  22:52   2,146,566,144  big_file.8
    08/03/2011  22:52   2,146,566,144  big_file.9
    08/03/2011  22:52   2,146,566,144  big_file.10
        21,465,661,440 bytes in 10 files and 2 dirs    21,465,661,440 bytes allocated
       443,775,528,960 bytes free

    Further testing this morning involved removing AVG IS 2014 and also the Microsoft Network Monitoring tools (I no longer remember why I even installed those) but sadly neither made any difference.

    To be honest I feel like I'm getting to the level of just trying things at random. None of the testing so far has even confirmed which end of the connection is the problem and I don't know how to pursue this further, other than perhaps raising a formal support request with Microsoft (which I may well not be entitled to do for free because this is an OEM product). Anything else I can try to nail down where this problem resides??


    Cheers, Steve

    Wednesday, January 28, 2015 9:17 AM
  • Thanks for your continued efforts and feedback.  The pic is what I wanted.  Just never seen a UEFI box with one small partition.  You spend your time testing file copies, where I devote most of my time to backup and restore.  I really do not need another box here but may just have to break down and get one.

    I meant a different nic on the beast (win 7).  All the MS offloading depends on proper nic behavior. I have no clue if safe mode with networking disables all the "extra" stuff in the NIC.

    Microsoft feels like they are missing market share in the NAS market, so they came up with this new SKU at a very low cost to "select" OEMs.  Because it is really Server 2012 R2 under the hood, a ~$700 OS, they have to put some limits on it to keep the lawyers happy: 2GB RAM, OS loaded on a drive, a limit on the # of hard drives, and a low price point, for example.  But none of these select OEMs have given anything to the MVPs that I am aware of.  Even my buddies at wegotserved do not seem to have done any hands on reviews and they get "everything."

    You know, as I sit here typing this, I ponder if perhaps the algorithms in the Seagate SSHD do not know ReFS?  I could test that cheaper than getting a Thecus box if I wanted a Seagate drive  <grin>  But no, it would fail on the Win 8 if that was it.

    I will ask my contacts at MS to take a look at this thread, but they stay so busy with v.next I don't know if they will spend many cycles on it.


    Grey

    Wednesday, January 28, 2015 1:01 PM
  • If it works from the win 8 box, it must be in the win 7 box?

    Grey

    Wednesday, January 28, 2015 1:58 PM
  • If it works from the win 8 box, it must be in the win 7 box?

    I'm going to answer this one first because much of the rest of this is not going to be pertinent to the problem at hand. I've been over and over this aspect whilst trying to think this issue through and you are right, except that it only happens when copying files to the Thecus and only then when the target is a ReFS partition on a mirror or parity storage space. So the best I can come up with is that it is most likely an issue on the Win7 box that is triggered by something that is happening on the server side, but even that is a bit of a stretch. This is why the lack of information from the error message bugs me so much - in order to debug a problem like this you need to know what resource has been exhausted and in which part of the software stack.

    Now that may not be easy to do in a generic way, and since programmers are inherently lazy it is tempting just to return a simple error value and be done with it. However, I've been in the position of doing just that in a commercial product and ended up having to go back and improve the error information when that particular message/code was tripped and I was expected to debug the problem! Obviously there is a significant difference between a Microsoft consumer product and a mainframe product that costs many times as much and comes with a built in maintenance fee, but the underlying requirement is the same - somebody needs to be able to solve the problem using the information returned. In this case that simply isn't possible.


    You spend your time testing file copies, where I devote most of my time to backup and restore

    I don't really want to be testing file copies - the initial intention was to benchmark the different storage space and file system combinations that I was intending to use but the error whilst doing so has spiralled into a cycle of testing and tweaking that really isn't achieving anything. My primary reason for having a NAS at all has always been backup. My current strategy for the two boxes participating in this testing involves having a local drive/partition to hold backups, running a daily incremental file copy to that partition which is then immediately copied to a NAS and backing that up with a regular (needs to be at least once per month to be totally secure) full image copy of the local disks that is also copied to the NAS afterwards (hence my fascination with copying large files).

    There is a weakness in that strategy because I've never been very good at performing that full image backup regularly enough, so one of the reasons for buying the W5000 was the possibility of making those backups automatic and driven from the server end. However, that takes the local backup drives out of the equation and leaves me with the need to backup the NAS, which I don't do with my existing unit because there are (nearly) always copies held elsewhere.

    The other reasons for going with the Thecus were a desire to backup the other machines in the household - I've always dreaded a hard drive failure on my wife's laptop but getting her to perform any kind of housekeeping is nigh on impossible and also to provide a file server capability protected by a single set of userids (the existing NAS data is open to all household members). So my goal is backup and restore too ;)


    I meant a different nic on the beast (win 7)

    I should have realised that but obviously wasn't thinking straight. I don't have a spare gigabit NIC to hand (although perhaps even a megabit one might provide an interesting data point) although there is such a card in one of my other (less used) PCs that I could cannibalise for testing purposes. Another project for the coming weekend methinks.


    put some limits on it to keep the lawyers happy: 2GB RAM, OS loaded on a drive, a limit on the # of hard drives

    That statement got me thinking, because I've never been able to find a definition anywhere of what the restrictions are with WSS 2012 R2 Essentials (if I bring up the software license terms on the box itself they are for 2012 Standard!?), and I wonder whether they'd stop me doing things like adding RAM or changing the processor.


    Even my buddies at wegotserved do not seem to have done any hands on reviews and they get "everything."

    The cynic in me wonders whether that is because Thecus know that they've just shovelled this onto a handful of existing boxes that barely meet the spec. and which simply aren't up to snuff as anything other than a box full of disks.  The Thecus boxes look like good value because they include the server OS (the unit cost me roughly 50% more than I could buy Windows Server 2012 R2 Essentials for) but if you can't realise that value then they are just an expensive NAS. 


    if perhaps the algorithms in the Seagate SSHD do not know ReFS?

    I haven't put a ReFS partition on the SSHD, only on the three 3TB WD Reds.


    I will ask my contacts at MS to take a look at this thread, but they stay so busy with v.next I don't know if they will spend many cycles on it

    Perhaps you could ask them if the next version of the OS could do a better job of identifying which resources have been exhausted, by what part of the stack and where in the maze of connectivity that makes up a modern computing environment?? {gd&r}


    Cheers, Steve

    Thursday, January 29, 2015 9:08 AM
  • LOL, don't get me started on error msgs.  Plus I figured out the root of all evil in 1997.  Take what you have now.  Server 2012 R2.  So it was designed/written, what, 6 years ago?  Whoever wrote it is either elbows deep into something else now, working at Google, or on a kayak at Victoria Falls.  Point being, it is hard to find who could actually help without spending tons of money on it.  MS loses money fixing old stuff that is already paid for.  They make money selling us new stuff.  If we find a bug, the first goal is to make sure it does not happen in v.next.  Then if it is an easy fix, they may take time to back port it.

    Backups:  YEA!!  You have exactly the product you need.  You run the connector to connect your wife's lappy and it gets backed up every day.  It is a fancy full image, single instance storage backup, nightly.  You need a file or folder?  Start the restore wizard and pick a date from the calendar.  Hard drive dies (or CryptoLocker)?  You can either create a boot thumb drive from the dashboard wizard, or if your wife's lappy has network/PXE boot just press F12/whatever.  It will see the Thecus and boot the restore wizard.  Again, pick the date from the calendar, click next and go watch TV.

    So it does a full backup every day, but the SIS or data de-dupe means your hard drive(s) do not fill up.  If it does not have a cluster from your data it will add that cluster to the database.  Otherwise it just puts a checkmark: yes, we have that.  Now when you back up the beast and other machines, it only adds the clusters it does not have.  For example, say you both have a different picture of your backyard.  The cluster(s) for blue sky may be the same, so it only needs to store one copy in the backup database.  By the same token, if both boxes had the same version of Office, it only needs one copy to restore one or 25 machines.

    Now that you have all these valuable system backups on the Thecus, you use Windows Server Backup to backup the Thecus to a local USB drive that you can take offsite or to an ISCSI target on another box in your network.  Similar restore process.   For files or folders start the wizard from the dashboard, for total failure start by booting from a thumb drive. Pick a date from the calendar and click next.

    So every day you get a full backup of everything, automagically.  You do not have to remember to do anything or move large files across the LAN.

    Adding RAM.  No clue, did you read the EULA before you clicked next?

    No data on the SSHD.  Again I am amazed any of this stuff works. But when the box thinks it is getting that from the ssd?  I know, a longshot LOL

    The specs are set by MS.  Don't blame Thecus just because they were first to market.  The box may not be suited for large file ReFS transfers.  But it does have a great backup system, storage pools, remote web access and is a ton of value for the money.  Plus it is a Windows OS, which has large advantages in a Windows network.  As well, Server 2012 R2 is the latest and greatest OS MS is shipping today.  It still amazes me that the Server Solution team can take an expensive Server OS, preconfigure a lot of stuff for us, add wizards to make it easier, add a lot of value to the product, and sell it for less than the underlying SKU.

    Oh, I forgot to mention, the capability is available to backup some of the stuff to Azure if you like.  I don't like, but YMMV


    Grey

    Thursday, January 29, 2015 2:08 PM
  • http://www.mdmarra.com/2012/11/why-you-shouldnt-use-local-in-your.html

    So I have been running a .local since about 1999 with no ill effects.  Since this NAS unit is geared towards small deployments, I don't see how any of that post really gives me a good argument against .local.

    Just because you set up a domain for a large/externally facing AD deployment one way isn't good enough justification to say that .local won't work in a small setup like this.

    Can someone give me a better justification other than "this isn't how we do it in large AD" ...which this isn't?

    Thursday, January 29, 2015 4:50 PM
  • Aw come on Susan.  We have bigger issues here than dot local :)  I can split that to another thread if you like.

    Grey

    Thursday, January 29, 2015 7:30 PM
  • We have bigger issues here

    We certainly do. In an effort to try and pin this to the Windows 7 gaming rig I thought I'd try hooking up my work laptop (a Lenovo W530) to the LAN (wired) and copying a 2GB file to the Thecus. Lo and behold fifteen seconds of setup and my old friend 0x800705AA (I don't even need to look it up now, it is ingrained in my brain). The hardware and the software stack on the laptop couldn't be much more different from the desktop box. Both are running i7s, but the laptop has a 3820QM, 16GB of RAM, runs Windows 7 Enterprise, and has an Intel NIC. Next step therefore is going to be to try other devices in the house - my wife's laptop also runs Win 7 Pro and my son's gaming rig 7 Home - even if they are using Powerline. I can also hook my wife's laptop to the spare wire that the work lappie currently sits on. More tomorrow, but the finger now points more firmly at the Thecus and 2012 in my eyes


    it does a full backup every day, but the SIS or data de-dupe means your hard drive(s) do not fill up

    Which sounds good, but prompts two questions. Firstly, isn't a full backup of a machine going to flood the network and how does it cope with backing up seven systems?? Secondly, is that de-dupe separate from the deduplication built in to Server 2012, which it seems only works with NTFS and not ReFS??


    you use Windows Server Backup to backup the Thecus to a local USB drive that you can take offsite

    Which is the bit that is giving me kittens, because I don't see how I can supply enough disk space to do that once I've got all seven Windows client devices connected and backing up, dedupe or no (short of buying another NAS with another 9TB of HD).


    did you read the EULA before you clicked next

    As I said above, the license agreement is the 2012 R2 Standard one. I cannot find any information on WSS 2012 R2 Essentials licensing anywhere, Microsoft site or otherwise.


    No data on the SSHD.  Again I am amazed any of this stuff works. But when the box thinks it is getting that from the ssd?

    Sorry, I really don't understand those sentences and suspect that one or other of us has missed a point somewhere, probably me.


    the capability is available to backup some of the stuff to Azure if you like

    Too rich for my blood. That kind of cost might make sense for a business with DR requirements, but I can't see the family budget stretching to terabytes of cloud storage.


    Cheers, Steve

    Thursday, January 29, 2015 10:03 PM
  • I agree on Azure, but they call it a strong point.

    Which sounds good, but prompts two questions. Firstly, isn't a full backup of a machine going to flood the network and how does it cope with backing up seven systems?? Secondly, is that de-dupe separate from the deduplication built in to Server 2012, which it seems only works with NTFS and not ReFS??

    Yes, the backup Single Instance Storage is for the backup database and has nothing to do with de-dupe in Server.  You set a backup window.  That box probably defaults to something like 8pm to 6am.  It backs up whatever machine it wants to first.  It may or may not be the first one you connected if you connect several at a time.  The first machine does take a fair bit of time.  The next time that machine is backed up it will only take minutes, unless you have added tons of new different data.  So once you get one machine backed up it will start on the next one.  It does not drag all the data across from here on.  It just looks at what clusters it does not have and adds them to the backup database.  It really, really works.  If you had 10 Win 7 boxes the database would only be slightly larger than one machine's.  (Oh, so say you have 10 BIG machines.  It might actually take a week to get one backup of each, then all 10 would back up nightly.  Folks do this with 25 machines, no problems.)

    Now of course it is up to you how and where you put your data. So for example, if you just leave all your data on your PCs and it gets backed up every night to the server, you really do not need a 3rd backup to something, unless you want to take that offsite.

    If you put all your data on the server to have a central place to store and share, then 2 things happen.  Your actual PC backup database will be very small, but you do need to back up the server to have a backup.  In my case, I do not have a ton of data that changes, so I could for example put all recorded TV before 2015 in one partition or folder even and copy that to a dedicated hard drive to take offsite.  Then I tell my nightly backup to back up everything except that stuff before 2015.  I can pop a 4TB drive in a USB dock for fairly cheap.

    But yes, if you want to back up 9TB every night you will need at least 9TB of space to put that backup.  And another NAS onsite is not offsite.  But if you can get your nightly backup down to say 3TB and the floods come, you can just grab that 3TB drive on the way out of the house.  With that 3TB drive plus the other 4TB drive you had offsite you can recreate your entire network: the server, its data, and your 7 other PCs.

    And for giggles, this is just my view and few share it.  And I guess it depends on how much data you have.  But my vote is to spend money on backup first, then raid.  As in, say, purchase two 3TB drives in RAID 0 and back up to one 6TB drive.  You lose capacity with mirror/parity.  Now of course if this is a mission critical business call center and uptime is vital, then of course the drive redundancy is a good idea.  But you have to have backups.

    And a note on the server backup.  Say you back up 2TB to a 2TB drive: you only have room for "one" copy.  The more extra free space you have on the drive, the farther back in time you can go on the calendar.  Windows Server Backup will fill up the disk, then start replacing the oldest stuff with the newer stuff.

    So in a nutshell, I don't think you need ReFS to protect your data


    Grey

    Thursday, January 29, 2015 10:46 PM
  • An update on the (lack of) progress on this. Over the weekend I tried a handful of potential fixes, all without success. These included changing the LAN connection to the box (using a different cable, connected to the 8 port switch rather than to one of the gigabit ports on the router, the execrable Virgin Media Super Hub), disabling autotuninglevel on both ends and disabling QoS Packet Scheduler on both ends. I also ran tests from every other machine currently in the house, but I cannot reproduce the issue when the network connection involves Powerline and it turns out that my wife's laptop has a megabit NIC (who knew!!) and that doesn't exhibit the problem even when wired (although of course it might not even over a gigabit connection). I didn't find the time to add a different NIC, megabit or gigabit, to the Windows 7 gaming box, but I may yet get to that.
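
    For reference, those two tuning changes amount to roughly the following (a sketch only - the adapter name is a placeholder):

        netsh interface tcp set global autotuninglevel=disabled           # receive window auto-tuning
        Disable-NetAdapterBinding -Name 'Ethernet' -ComponentID ms_pacer  # QoS Packet Scheduler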

    At the moment my only medium to long term solution seems to involve 'don't do that', which I hate as a technical 'solution'. It does worry me that, if this is file size related, it may manifest itself later with files larger than 2GB, but I've done some limited testing with a ~20GB file without any additional problems. Since an extra stick of memory for the box was only twenty quid, and since adding memory is unlikely to make things worse at the levels we're talking about, I've got another 2GB on the way and we'll see if that has any effect on this issue once it arrives later in the week.


    The next time that machine is backed up it will only take minutes, unless you have added tons of new different data

    Aha. I got the impression from one of your previous posts that it was backing up the whole shebang each time, but obviously (and thankfully) not.


    you really do not need a 3rd backup to something, unless you want to take that offsite

    Two copies of everything is good enough for me. I think that there will be a limited 'shared storage' area but it won't hold much data (certainly small enough to use the server backup to preserve another copy, probably on the old NAS) and the intention is that things like photos are held on one or more of the PCs and backed up to the Thecus.


    my vote is to spend money on backup first, then raid

    I hear you, and that is the approach I've taken to date. It was not my intention with the Thecus, but it would seem that reality has overtaken my intended approach (which was to have the three drives in a RAID 5 style parity setup as an extra level of protection). Since my testing suggests that there is little or no difference between the two file systems in a simple non-storage spaces setup, I think I'm going to plump for ReFS across a flat 9TB, thus maximising the available space whilst sacrificing a little data security.


    I don't think you need ReFS to protect your data

    Maybe not, but I like the idea of baked in protection against (or at least identification of, since the automatic elements disappear without a second copy to recover from) the degradation of data. Without that you often have no idea that things are going bad until it is too late.


    Cheers, Steve




    • Edited by CessPitts Tuesday, February 3, 2015 8:47 AM
    Tuesday, February 3, 2015 8:43 AM
  • sounds like you have it sorted, sorta :)  All sounds great to me.  The only thing I'd add is that setting up a backup of the server OS partition and the small amount of shared data to the old NAS is a great idea.  That way you can restore the Thecus, users, config and all if needed.  Also, if your old NAS can be an iSCSI target, you get "better" backups.

    Better as in the server sees iSCSI targets as a local drive and will do incrementals and give you the calendar to pick a restore date.  If you just back up to a network share, it simply does a full each time, overwriting the old.

    Thanks again for all your input and I hope you will keep us updated on your experiences.


    Grey

    Tuesday, February 3, 2015 1:35 PM
  • sounds like you have it sorted, sorta

    All I've done is bypass the problem, and experience tells me that that may come back to bite me at some point. It would seem that there is a fairly fundamental issue somewhere in the networking layer, probably on the Windows 7 boxes, which only manifests itself when the server end cannot keep up with the network transmissions. Windows 8 handles the situation but Windows 7 does not.

    What leads me to that conclusion is some further testing performed after I'd added an additional 2GB of RAM to the Thecus W5000. This did not eliminate the problem, but subjectively it did seem to take slightly longer to fail. This led me to monitor what was going on at both ends during both a successful copy (from both PCs) and the unsuccessful one. The Process Explorer Sysinfo summaries below illustrate the differences that I see between copying to an NTFS parity space (which works from both sources) and to a ReFS parity space (which only works from Windows 8):

    [Screenshot: Process Explorer system information summary on the Thecus whilst halfway through pushing 10 x 2GB files to a ReFS parity storage space from a Windows 8 box]

    [Screenshot: Process Explorer system information summary on the Thecus whilst halfway through pushing 10 x 2GB files to an NTFS parity storage space from a Windows 7 box]

    As you can see (hopefully) the ReFS parity space results in significant pauses in the network traffic, in fact there is more dead space than transmission time, and it is after the first of these that the Windows 7 system coughs:

    [Screenshot: Process Explorer system information summary on the Thecus just after failing to push a 2GB file to a ReFS parity storage space from a Windows 7 box]

    FWIW I see the same picture on the client side, but forgot to save the screenshot of the failing copy. I do have the Sysinfo summary displays associated with the successful copies but I don't see that they add any useful information.

    Anyway, the end result of all this toing and froing is that we've never fixed the underlying issue but I've decided that I had to move on and ended up installing StableBit's DrivePool and using that instead of storage spaces of any format. For me this has the great advantage that I can duplicate some data but since I can control that at a directory level there only needs to be as much extra space used as necessary for that data (rather than duplicating everything or nothing, which were my options with the Windows native solutions).

    if your old NAS can be an iSCSI target, you get "better" backups

    Sadly not, at least not with the D-Link firmware et al – in fact I cannot find any way to get the server backup to write to the old NAS over the network; even with a drive letter mapped it insists on a local drive. Fortunately I have an even older 500GB Buffalo Linkstation that can be USB attached so I've gone with that for now. It may prove not to be big enough though, at which point I may have to try and hack the D-Link DNS323 to be an iSCSI target.

    I hope you will keep us updated on your experiences

    Thus far they are very mixed – I apparently have both server and client computer backups running (and gently eating network bandwidth – that aspect seems to work nicely, just trickling along in the background) but I cannot even get started on restoring individual files (for which I'll open another thread).


    Cheers, Steve


    • Edited by CessPitts Friday, February 13, 2015 8:53 AM Corrected typo and added summary paragraph
    Friday, February 13, 2015 8:47 AM
  • Just look at you.  Posting pictures and everything!  Maybe they dropped network backups.  Have not tried it in a long while.  With the drive mapped with a letter, in the wizard that shows destinations, is there a checkbox at the bottom that shows "all disks"?

    Again, thanks for all your feedback.


    Grey

    Friday, February 13, 2015 12:35 PM
  • Posting pictures and everything

    It would have been that way from the get go, if only I'd been permitted so to do.


    Maybe they dropped network backups

    Only from the 'friendly' Essentials front-end. It is possible, as it turns out, to set up network backup destinations using the wbadmin.msc console. The more I use the Essentials front end the more I'm looking for the full fat ways of doing things. Too much of a techie, I guess.
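
    For anyone else hitting the same limitation, a one-off backup to a network destination can also be started from an elevated prompt, along these lines (a sketch only - the share path and included volume are placeholders, and note that scheduled backups to a share keep only the most recent copy, as Grey mentioned above):

        # Back up the data volume (plus everything needed for bare-metal recovery) to the old NAS.
        wbadmin start backup -backupTarget:\\OLDNAS\ServerBackup -include:D: -allCritical -quiet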

    BTW I tripped over an interesting Technet Blog page that I wish I'd found whilst having the original large file copy problem since it explains a little more about how this is working under the covers: http://blogs.technet.com/b/perfguru/archive/2009/02/19/file-copy-fails-with-the-error-insufficient-system-resources.aspx


    Cheers, Steve


    • Edited by CessPitts Monday, February 16, 2015 10:05 PM Added note about PERFGURU blog entry
    Monday, February 16, 2015 9:56 PM
  • Well the dashboard is very good for noob's.  If you know how to map a drive letter, you can use the windows server backup console :)

    Grey

    Monday, February 16, 2015 10:05 PM
  • Not sure if you have resolved the issue completely, however Server 2012 R2 (all variants) comes with Receive Side Scaling for the NICs enabled (it is disabled on Windows 2012 by default). Usually it shouldn't be an issue, however the Atom CPU is so weak that you might want to disable it.

    Receive side scaling (RSS) is a network driver technology that enables the efficient distribution of network receive processing across multiple CPUs in multiprocessor systems.

    I've seen some cases where errors were coming up because of this, so you might want to give it a try.
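
    For example, something along these lines (the adapter name is a placeholder):

        netsh interface tcp set global rss=disabled     # global RSS off
        Disable-NetAdapterRss -Name 'Ethernet'          # or per adapter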

    Sunday, August 14, 2016 2:51 AM