How to deploy ATA in complex / distributed / mixed environments

  • Question

  • Hello,

    I initially posted an answer in Weston's thread (https://social.technet.microsoft.com/Forums/security/en-US/99716d50-c706-456b-b916-60ffbff19aa6/no-deployment-option-for-a-slightly-larger-hyperv-environment-with-virtual-dcs?forum=mata), but I am opening this dedicated post to give everyone the opportunity to explain their own constraints.

    Please let me know if you want further details about my environment (in that case, we will need an offline discussion, because I do not want to share too much on a public forum). I am definitely committed to helping you, because your solution looks very promising and is likely to be included in our existing contract with MS (based on an answer given in another thread). I also see several debug/warning messages in your log files that I would eventually like to have explained.

    As far as I am concerned, I see several challenges to address. The idea of analysing network traffic to and from each DC, rather than only event logs, is great, but it does bring new challenges. Here is what I can say so far.

    • I guess (hope?) this limitation applies to the preview only, but a maximum of 4 gateways, each handling at most 4 DCs, is definitely not enough. My production environment is made up of ~50 DCs/RODCs and is planned to grow over the next couple of years to more than ~120 DCs/RODCs, serving more than 100 different physical offices. For most offices, we deploy only 1 DC (or RODC, depending on the case). For our biggest sites, it can be up to 10 DCs.
    • Given the expected amount of data to be stored (based on the estimate you provided), it looks essential to be able to use a dedicated SQL back end. Now that you are part of Microsoft, it would be great to have support for MS SQL ;) but at the very least, support for a remote MongoDB instance is a must.
    • On my biggest sites, several of our DCs are physical. However, I am not allowed to deploy the ATA gateway on physical servers, because physical hardware must remain an exception and can only be requested if justified by performance load. Given that, it looks quite challenging to "route" DC network traffic from all the physical hardware (located on different subnets of the same site) to gateways hosted as VMs on our VMware infrastructure (which is distributed across tens of VMware ESX hosts). An agent on the DC itself might be an alternative, but in that case I worry about the performance impact. I installed the preview over a weekend, after a first test in my lab environment, and the impact during the initial import was quite large (in terms of LDAP queries against our AD). According to our monitoring system, the ATA GW also periodically performs huge LDAP queries. This is likely due to the complexity of our environment (20000+ users, 50000+ groups), but there is probably room for improvement. In your debug log file, I periodically see warnings about heavy requests for group membership evaluation.
    • On my smallest sites, the need for dedicated servers is also a concern, but for different reasons. There, the DCs are usually virtual, running on a small ESX environment, so the port mirroring requirement should not be a concern. However, the need to systematically deploy a new server (especially if it has to be in a workgroup) might be seen as too much in terms of management cost in my environment today. On the other hand, having the agent on the DC/RODC might be an option that can be discussed, as long as the load of the LDAP requests performed against the DC remains reasonable. This applies equally to a dedicated gateway, since our DCs on small sites are not very powerful. Traffic from the gateway to the main ATA server should also be highly optimised. In the worst case, the network link is 2 MB and already heavily saturated during business hours.
    • Related to my previous comment, there should be a way to restrict which DCs are used for LDAP queries.
    • (nice to have) Support for multiple domains would be welcome.
    • (nice to have) I noticed the (automated?) feature that highlights some items as "sensitive". It would be great to be able to manage that manually, according to our internal design.
    • A light built-in capability to monitor the gateways: the amount of traffic being collected, whether a gateway is down, etc.
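
To put the scale point from the first bullet into numbers, here is a rough sketch of the arithmetic. The limits (4 gateways, 4 DCs per gateway) and the DC counts (~50 today, ~120 planned) come from the post; the function names are illustrative only. It also assumes that any 4 DCs can share a gateway, which is optimistic for an environment where most offices host a single DC.

```python
import math

# Preview limits as described in the post (not taken from ATA documentation).
MAX_GATEWAYS = 4
MAX_DCS_PER_GATEWAY = 4


def preview_capacity() -> int:
    """Total number of DCs the preview limits allow to be monitored."""
    return MAX_GATEWAYS * MAX_DCS_PER_GATEWAY


def gateways_needed(dc_count: int, dcs_per_gateway: int = MAX_DCS_PER_GATEWAY) -> int:
    """Lower bound on gateways for dc_count DCs, ignoring site boundaries."""
    return math.ceil(dc_count / dcs_per_gateway)


if __name__ == "__main__":
    print(f"Preview capacity: {preview_capacity()} DCs")
    for dc_count in (50, 120):
        print(f"{dc_count} DCs need at least {gateways_needed(dc_count)} gateways")
```

Even under that optimistic packing assumption, the current and planned environments need well over 4 gateways. With one DC per office across 100+ offices, the real number would be driven by the number of sites rather than by this lower bound.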

    Regards.


    • Edited by Sylvain06 Friday, May 15, 2015 3:05 AM
    Friday, May 15, 2015 3:02 AM

All replies

  • Hi Sylvain,

    Thank you very much for the detailed feedback. This is very helpful!

    I do have some questions, as well as some answers:

    1. Regarding the scale limitation - This is definitely on our list of mandatory improvements and we are working hard on increasing the number of supported Domain Controllers so do expect good news on this in the near future.

    2. Regarding the SQL back-end - at the moment we don't have plans to move away from document DBs due to the nature of the data we're handling (network traffic is unstructured data by design, as each protocol has its own layout, so creating a data structure in SQL to accommodate all possible fields in all possible protocols is virtually impossible). Having said that, we are planning on supporting a remote MongoDB back-end.

    On the same topic - does your organization have central instances of MongoDB (similar to central SQL) which can be used by ATA? Or would you rather see us provide the necessary binaries, deployment, etc.?

    3. Regarding the physical box requirements - we are working on allowing port mirroring of physical DCs into virtual gateways. I can't provide you an ETA on that, but we're looking into this area.

    4. On the heavy LDAP - what impact (in terms of CPU/RAM/latency) have you witnessed on the Domain Controllers? I'd also be very interested in looking at the debug log file to see the queries you've mentioned. (By our definition, a "Heavy LDAP query" is a query which took over 200 milliseconds. The threshold is configurable in GatewayConfiguration.json through the LdapHeavySearchTimeSpan property.)

    5. Regarding the small branches - can you share your definition of small? This would be very helpful. Specifically, I'm interested in the number of users/computers and servers in the site, as well as some statistics from the DC (mainly the packets/sec). This will allow us to better understand your scenario.

    6. Restricting LDAP traffic to a specific DC, other than the DC being monitored by the GW - is my understanding correct? Can you please share some insight into why? (We are mainly querying the DC based on the traffic we see, so the queries in most cases represent users/computers which are using the DC in question for their AuthN/AuthZ processes.)

    7. Multi-domain support - already in the backlog. Unfortunately there is no ETA, but it is on our "short todo" list.

    8. "Manual sensitive accounts" - did you have any specific scenario in mind for the sensitive accounts? It is a very good suggestion and we'll definitely add it to our todo list; we are just interested in understanding more.

    9. "Monitoring capabilities" - In the oven :-).
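
To illustrate the "heavy LDAP query" definition from point 4, here is a minimal sketch of the threshold check. It is not ATA code: only the 200 ms default comes from the reply above. The function names and the sample filters and durations are hypothetical, and the way LdapHeavySearchTimeSpan is written in GatewayConfiguration.json is not shown in this thread, so it is not reproduced here.

```python
from datetime import timedelta

# Default from the reply: a query that took over 200 ms is "heavy".
HEAVY_SEARCH_THRESHOLD = timedelta(milliseconds=200)


def is_heavy(duration: timedelta,
             threshold: timedelta = HEAVY_SEARCH_THRESHOLD) -> bool:
    """True when the query took strictly longer than the threshold."""
    return duration > threshold


def heavy_queries(samples, threshold: timedelta = HEAVY_SEARCH_THRESHOLD):
    """Keep only the (ldap_filter, duration) pairs that exceed the threshold."""
    return [(flt, d) for flt, d in samples if is_heavy(d, threshold)]


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    samples = [
        ("(objectClass=user)", timedelta(milliseconds=35)),
        ("(memberOf=CN=SomeLargeGroup,DC=example,DC=com)", timedelta(milliseconds=850)),
        ("(objectClass=computer)", timedelta(milliseconds=200)),
    ]
    for flt, duration in heavy_queries(samples):
        print(f"heavy: {flt} took {duration.total_seconds() * 1000:.0f} ms")
```

Raising the threshold would reduce the number of warnings in the debug log, but it would not change the load the queries place on the DC.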

    Can you please share your email? (You can use @MichaelDubinsky on twitter for a semi-private message if you don't want to share here).

    Thank you once again for the detailed information and feedback!

    Microsoft ATA Team.

    Monday, May 18, 2015 8:04 AM
  • Hello Michael,

    Thanks for your feedback... I will be very happy to provide answers to your questions. But offline, thanks for your understanding.

    Also, I do not want to share my business email on a public forum. I tried your idea of going through Twitter, but in order to send a DM, it looks like you must follow me.

    Regards.

    Sylvain

    Monday, May 18, 2015 9:59 AM
  • @Michael, I just sent an email to our Microsoft TAM, kindly asking him to pass my contact info on to you on my behalf.

    Regards.

    Sylvain

    Monday, May 18, 2015 4:49 PM