High Availability
The Exchange 2016 Preferred Architecture
The Preferred Architecture (PA) is the Exchange Engineering Team's best practice recommendation for what we believe is the optimum deployment architecture for Exchange 2016, and one that is very similar to what we deploy in Office 365. While Exchange 2016 offers a wide variety of architectural choices for on-premises deployments, the architecture discussed below is our most scrutinized one ever. While there are other supported deployment architectures, they are not recommended.

The PA is designed with several business requirements in mind, such as the requirement that the architecture be able to:

- Include both high availability within the datacenter, and site resilience between datacenters
- Support multiple copies of each database, thereby allowing for quick activation
- Reduce the cost of the messaging infrastructure
- Increase availability by optimizing around failure domains and reducing complexity

The specific prescriptive nature of the PA means of course that not every customer will be able to deploy it (for example, customers without multiple datacenters). And some of our customers have different business requirements or other needs which necessitate a different architecture. If you fall into those categories, and you want to deploy Exchange on-premises, there are still advantages to adhering as closely as possible to the PA, deviating only where your requirements differ widely. Alternatively, you can consider Office 365, where you can take advantage of the PA without having to deploy or manage servers.

The PA removes complexity and redundancy where necessary to drive the architecture to a predictable recovery model: when a failure occurs, another copy of the affected database is activated.

The PA is divided into four areas of focus:

- Namespace design
- Datacenter design
- Server design
- DAG design

Namespace Design

In the Namespace Planning and Load Balancing Principles articles, I outlined the various configuration choices that are available with Exchange 2016. For the namespace, the choices are to either deploy a bound namespace (having a preference for the users to operate out of a specific datacenter) or an unbound namespace (having the users connect to any datacenter without preference). The recommended approach is to utilize the unbound model, deploying a single Exchange namespace per client protocol for the site resilient datacenter pair (where each datacenter is assumed to represent its own Active Directory site - see more details on that below). For example:

- autodiscover.contoso.com
- For HTTP clients: mail.contoso.com
- For IMAP clients: imap.contoso.com
- For SMTP clients: smtp.contoso.com

Each Exchange namespace is load balanced across both datacenters in a layer 7 configuration that does not leverage session affinity, resulting in fifty percent of traffic being proxied between datacenters. Traffic is equally distributed across the datacenters in the site resilient pair, via round robin DNS, geo-DNS, or other similar solutions. From our perspective, the simplest solution is the least complex and easiest to manage, so our recommendation is to leverage round robin DNS. For the Office Online Server farm, a namespace is deployed per datacenter, with the load balancer operating at layer 7 and maintaining session affinity using cookie based persistence.
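The round robin approach can be sketched with standard Windows DNS tooling. The following is a minimal illustration, not part of the original article; the zone, host name, and VIP addresses are hypothetical placeholders. It publishes one A record per datacenter load balancer so DNS rotates the answers across the pair:

# Minimal sketch: one A record per datacenter load balancer VIP (addresses are placeholders).
Import-Module DnsServer
Add-DnsServerResourceRecordA -ZoneName "contoso.com" -Name "mail" -IPv4Address "192.0.2.10"    # Datacenter 1 VIP
Add-DnsServerResourceRecordA -ZoneName "contoso.com" -Name "mail" -IPv4Address "198.51.100.10" # Datacenter 2 VIP

# Verify that resolution returns both addresses; clients will distribute across them.
Resolve-DnsName -Name "mail.contoso.com" -Type A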
Figure 1: Namespace Design in the Preferred Architecture

In the event that you have multiple site resilient datacenter pairs in your environment, you will need to decide if you want to have a single worldwide namespace, or if you want to control the traffic to each specific datacenter by using regional namespaces. Ultimately your decision depends on your network topology and the associated cost of using an unbound model; for example, if you have datacenters located in North America and Europe, the network link between these regions might not only be costly, but it might also have high latency, which can introduce user pain and operational issues. In that case, it makes sense to deploy a bound model with a separate namespace for each region. However, options like geographical DNS offer you the ability to deploy a single unified namespace, even when you have costly network links; geo-DNS allows you to have your users directed to the closest datacenter based on their client's IP address.

Figure 2: Geo-distributed Unbound Namespace

Site Resilient Datacenter Pair Design

To achieve a highly available and site resilient architecture, you must have two or more datacenters that are well-connected (ideally, you want a low round-trip network latency, otherwise replication and the client experience are adversely affected). In addition, the datacenters should be connected via redundant network paths supplied by different operating carriers. While we support stretching an Active Directory site across multiple datacenters, for the PA we recommend that each datacenter be its own Active Directory site. There are two reasons:

- Transport site resilience via Shadow Redundancy and Safety Net can only be achieved when the DAG has members located in more than one Active Directory site.
- Active Directory has published guidance that states that subnets should be placed in different Active Directory sites when the round trip latency is greater than 10ms between the subnets.

Server Design

In the PA, all servers are physical servers. Physical hardware is deployed rather than virtualized hardware for two reasons:

- The servers are scaled to use 80% of resources during the worst-failure mode.
- Virtualization adds an additional layer of management and complexity, which introduces additional recovery modes that do not add value, particularly since Exchange provides that functionality.

Commodity server platforms are used in the PA. A commodity platform includes:

- 2U, dual socket servers (20-24 cores)
- Up to 192GB of memory
- A battery-backed write cache controller
- 12 or more large form factor drive bays within the server chassis

Additional drive bays can be deployed per-server depending on the number of mailboxes, mailbox size, and the server's scalability.

Each server houses a single RAID1 disk pair for the operating system, Exchange binaries, protocol/client logs, and transport database. The rest of the storage is configured as JBOD, using large capacity 7.2K RPM serially attached SCSI (SAS) disks (while SATA disks are also available, the SAS equivalent provides better IO and a lower annualized failure rate).

Each disk that houses an Exchange database is formatted with ReFS (with the integrity feature disabled) and the DAG is configured such that AutoReseed formats the disks with ReFS:

Set-DatabaseAvailabilityGroup <DAG> -FileSystem ReFS

BitLocker is used to encrypt each disk, thereby providing data encryption at rest and mitigating concerns around data theft or disk replacement.
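As a minimal sketch of the per-disk preparation described above (the drive letter, volume label, and BitLocker protector choice are assumptions, not prescriptions from the article):

# Format a database volume with ReFS and the integrity feature disabled, per the PA guidance.
Format-Volume -DriveLetter X -FileSystem ReFS -SetIntegrityStreams $false -NewFileSystemLabel "ExVol01" -Confirm:$false

# Encrypt the volume at rest with BitLocker (the protector and cipher choices are assumptions).
Enable-BitLocker -MountPoint "X:" -EncryptionMethod Aes256 -RecoveryPasswordProtector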
For more information, see Enabling BitLocker on Exchange Servers.

To ensure that the capacity and IO of each disk is used as efficiently as possible, four database copies are deployed per-disk. The normal run-time copy layout ensures that there is no more than a single active copy per disk.

At least one disk in the disk pool is reserved as a hot spare. AutoReseed is enabled and quickly restores database redundancy after a disk failure by activating the hot spare and initiating database copy reseeds.

Database Availability Group Design

Within each site resilient datacenter pair you will have one or more DAGs.

DAG Configuration

As with the namespace model, each DAG within the site resilient datacenter pair operates in an unbound model with active copies distributed equally across all servers in the DAG. This model:

- Ensures that each DAG member's full stack of services (client connectivity, replication pipeline, transport, etc.) is being validated during normal operations.
- Distributes the load across as many servers as possible during a failure scenario, thereby only incrementally increasing resource use across the remaining members within the DAG.

Each datacenter is symmetrical, with an equal number of DAG members in each datacenter. This means that each DAG has an even number of servers and uses a witness server for quorum maintenance.

The DAG is the fundamental building block in Exchange 2016. With respect to DAG size, a larger DAG provides more redundancy and resources. Within the PA, the goal is to deploy larger DAGs (typically starting out with an eight member DAG and increasing the number of servers as required to meet your requirements). You should only create new DAGs when scalability introduces concerns over the existing database copy layout.

DAG Network Design

The PA leverages a single, non-teamed network interface for both client connectivity and data replication. A single network interface is all that is needed because ultimately our goal is to achieve a standard recovery model regardless of the failure - whether a server failure occurs or a network failure occurs, the result is the same: a database copy is activated on another server within the DAG. This architectural change simplifies the network stack and obviates the need to manually eliminate heartbeat cross-talk.

Note: While your environment may not use IPv6, IPv6 remains enabled per IPv6 support in Exchange.

Witness Server Placement

Ultimately, the placement of the witness server determines whether the architecture can provide automatic datacenter failover capabilities or whether it will require a manual activation to enable service in the event of a site failure.

If your organization has a third location with a network infrastructure that is isolated from network failures that affect the site resilient datacenter pair in which the DAG is deployed, then the recommendation is to deploy the DAG's witness server in that third location. This configuration gives the DAG the ability to automatically failover databases to the other datacenter in response to a datacenter-level failure event, regardless of which datacenter has the outage. If your organization does not have a third location, consider placing the witness in Azure; alternatively, place the witness server in one of the datacenters within the site resilient datacenter pair.
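A minimal sketch of the witness configuration (the DAG, server, and path names are hypothetical), pointing an existing DAG at a file server in a third location and confirming the result:

# Place the witness on a hypothetical file server in an isolated third site (or an Azure IaaS VM).
Set-DatabaseAvailabilityGroup DAG1 -WitnessServer FS3.contoso.com -WitnessDirectory "C:\DAG1\Witness"

# Confirm the witness in use and where the Primary Active Manager currently resides.
Get-DatabaseAvailabilityGroup DAG1 -Status | Format-List Name,WitnessServer,WitnessDirectory,PrimaryActiveManager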
If you have multiple DAGs within the site resilient datacenter pair, then place the witness server for all DAGs in the same datacenter (typically the datacenter where the majority of the users are physically located). Also, make sure the Primary Active Manager (PAM) for each DAG is located in that same datacenter.

Data Resiliency

Data resiliency is achieved by deploying multiple database copies. In the PA, database copies are distributed across the site resilient datacenter pair, thereby ensuring that mailbox data is protected from software, hardware and even datacenter failures. Each database has four copies, with two copies in each datacenter, which means at a minimum the PA requires four servers. Out of these four copies, three of them are configured as highly available. The fourth copy (the copy with the highest Activation Preference number) is configured as a lagged database copy. Due to the server design, each copy of a database is isolated from its other copies, thereby reducing failure domains and increasing the overall availability of the solution, as discussed in DAG: Beyond the "A".

The purpose of the lagged database copy is to provide a recovery mechanism for the rare event of system-wide, catastrophic logical corruption. It is not intended for individual mailbox recovery or mailbox item recovery. The lagged database copy is configured with a seven day ReplayLagTime. In addition, the Replay Lag Manager is enabled to provide dynamic log file play down for lagged copies when availability is compromised.

When using the lagged database copy in this manner, it is important to understand that the lagged database copy is not a guaranteed point-in-time backup. The lagged database copy will have an availability threshold, typically around 90%, due to periods where the disk containing a lagged copy is lost due to disk failure, periods where the lagged copy becomes an HA copy (due to automatic play down), and periods where the lagged database copy is rebuilding the replay queue.

To protect against accidental (or malicious) item deletion, Single Item Recovery or In-Place Hold technologies are used, and the Deleted Item Retention window is set to a value that meets or exceeds any defined item-level recovery SLA. With all of these technologies in play, traditional backups are unnecessary; as a result, the PA leverages Exchange Native Data Protection.

Office Online Server Design

At a minimum, you will want to deploy two Office Online Servers in each datacenter that hosts Exchange 2016 servers. Each Office Online Server should have 8 processor cores, 32GB of memory and at least 40GB of space dedicated for log files. Note: The Office Online Server infrastructure does not need to be exclusive to Exchange. As such, the hardware guidance takes into account usage by SharePoint and Skype for Business. Be sure to work with any other teams using the Office Online Server infrastructure to ensure the servers are adequately sized for your specific deployment. The Exchange servers within a particular datacenter are configured to use the local Office Online Server farm via the following cmdlet:

Set-MailboxServer <East MBX Server> -WACDiscoveryEndPoint https://oos-east.contoso.com/hosting/discovery

Summary

Exchange Server 2016 continues the investments introduced in previous versions of Exchange by reducing the server role architecture complexity, aligning with the Preferred Architecture and Office 365 design principles, and improving coexistence with Exchange Server 2013.
These changes simplify your Exchange deployment, without decreasing the availability or the resiliency of the deployment. And in some scenarios, when compared to previous generations, the PA increases availability and resiliency of your deployment.

Ross Smith IV
Principal Program Manager
Office 365 Customer Experience

The Preferred Architecture
During my session at the recent Microsoft Exchange Conference (MEC), I revealed Microsoft's preferred architecture (PA) for Exchange Server 2013. The PA is the Exchange Engineering Team's prescriptive approach to what we believe is the optimum deployment architecture for Exchange 2013, and one that is very similar to what we deploy in Office 365. While Exchange 2013 offers a wide variety of architectural choices for on-premises deployments, the architecture discussed below is our most scrutinized one ever. While there are other supported deployment architectures, they are not recommended.

The PA is designed with several business requirements in mind. For example, requirements that the architecture be able to:

- Include both high availability within the datacenter, and site resilience between datacenters
- Support multiple copies of each database, thereby allowing for quick activation
- Reduce the cost of the messaging infrastructure
- Increase availability by optimizing around failure domains and reducing complexity

The specific prescriptive nature of the PA means of course that not every customer will be able to deploy it (for example, customers without multiple datacenters). And some of our customers have different business requirements or other needs which necessitate an architecture different from that shown here. If you fall into those categories, and you want to deploy Exchange on-premises, there are still advantages to adhering as closely as possible to the PA, deviating only where your requirements widely differ. Alternatively, you can consider Office 365 where you can take advantage of the PA without having to deploy or manage servers.

Before I delve into the PA, I think it is important that you understand a concept that is the cornerstone for this architecture - simplicity.

Simplicity

Failure happens. There is no technology that can change this. Disks, servers, racks, network appliances, cables, power substations, generators, operating systems, applications (like Exchange), drivers, and other services - there is simply no part of an IT services offering that is not subject to failure.

One way to mitigate failure is to build in redundancy. Where one entity is likely to fail, two or more entities are used. This pattern can be observed in Web server arrays, disk arrays, and the like. But redundancy by itself can be prohibitively expensive (simple multiplication of cost). For example, the cost and complexity of the SAN based storage system that was at the heart of Exchange until the 2007 release drove the Exchange Team to step up its investment in the storage stack and to evolve the Exchange application to integrate the important elements of storage directly into its architecture. We recognized that every SAN system would ultimately fail, and that implementing a highly redundant system using SAN technology would be cost-prohibitive. In response, Exchange has evolved from requiring expensive, scaled-up, high-performance SAN storage and related peripherals, to now being able to run on cheap, scaled-out servers with commodity, low-performance SAS/SATA drives in a JBOD configuration with commodity disk controllers. This architecture enables Exchange to be resilient to any storage related failure, while enabling you to deploy large mailboxes at a reasonable cost. By building the replication architecture into Exchange and optimizing Exchange for commodity storage, the failure mode is predictable from a storage perspective.
This approach does not stop at the storage layer; redundant NICs, power supplies, etc., can also be removed from the server hardware. Whether it is a disk, controller, or motherboard that fails, the end result should be the same: another database copy is activated and takes over.

The more complex the hardware or software architecture, the more unpredictable failure events can be. Managing failure at any scale is all about making recovery predictable, which drives the necessity of having predictable failure modes. Examples of complex redundancy are active/passive network appliance pairs, aggregation points on the network with complex routing configurations, network teaming, RAID, multiple fiber pathways, etc. Removing complex redundancy seems unintuitive on its face - how can removing redundancy increase availability? Moving away from complex redundancy models to a software-based redundancy model creates a predictable failure mode.

The PA removes complexity and redundancy where necessary to drive the architecture to a predictable recovery model: when a failure occurs, another copy of the affected database is activated.

The PA is divided into four areas of focus:

- Namespace design
- Datacenter design
- Server design
- DAG design

Namespace Design

In the Namespace Planning and Load Balancing Principles articles, I outlined the various configuration choices that are available with Exchange 2013. From a namespace perspective, the choices are to either deploy a bound namespace (having a preference for the users to operate out of a specific datacenter) or an unbound namespace (having the users connect to any datacenter without preference). The recommended approach is to utilize the unbound model, deploying a single namespace per client protocol for the site resilient datacenter pair (where each datacenter is assumed to represent its own Active Directory site - see more details on that below). For example:

- autodiscover.contoso.com
- For HTTP clients: mail.contoso.com
- For IMAP clients: imap.contoso.com
- For SMTP clients: smtp.contoso.com

Figure 1: Namespace Design

Each namespace is load balanced across both datacenters in a configuration that does not leverage session affinity, resulting in fifty percent of traffic being proxied between datacenters. Traffic is equally distributed across the datacenters in the site resilient pair, via DNS round-robin, geo-DNS, or other similar solution you may have at your disposal. From our perspective, the simplest solution is the least complex and easiest to manage, so our recommendation is to leverage DNS round-robin.

In the event that you have multiple site resilient datacenter pairs in your environment, you will need to decide if you want to have a single worldwide namespace, or if you want to control the traffic to each specific datacenter pair by using regional namespaces. Ultimately your decision depends on your network topology and the associated cost of using an unbound model; for example, if you have datacenters located in North America and Europe, the network link between these regions might not only be costly, but it might also have high latency, which can introduce user pain and operational issues. In that case, it makes sense to deploy a bound model with a separate namespace for each region.
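A minimal configuration sketch for binding the protocols to the unbound namespace (the server names and URLs are hypothetical, not values from the article):

# Point Autodiscover, OWA, and Outlook Anywhere at the single shared namespace.
Set-ClientAccessServer CAS-01 -AutoDiscoverServiceInternalUri "https://autodiscover.contoso.com/Autodiscover/Autodiscover.xml"
Set-OwaVirtualDirectory "CAS-01\owa (Default Web Site)" -InternalUrl "https://mail.contoso.com/owa" -ExternalUrl "https://mail.contoso.com/owa"
Set-OutlookAnywhere "CAS-01\Rpc (Default Web Site)" -InternalHostname "mail.contoso.com" -InternalClientsRequireSsl $true -ExternalHostname "mail.contoso.com" -ExternalClientsRequireSsl $true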
Site Resilient Datacenter Pair Design

To achieve a highly available and site resilient architecture, you must have two or more datacenters that are well-connected (ideally, you want a low round-trip network latency, otherwise replication and the client experience are adversely affected). In addition, the datacenters should be connected via redundant network paths supplied by different operating carriers. While we support stretching an Active Directory site across multiple datacenters, for the PA we recommend having each datacenter be its own Active Directory site. There are two reasons:

- Transport site resilience via Shadow Redundancy and Safety Net can only be achieved when the DAG has members located in more than one Active Directory site.
- Active Directory has published guidance that states that subnets should be placed in different Active Directory sites when the round trip latency is greater than 10ms between the subnets.

Server Design

In the PA, all servers are physical, multi-role servers. Physical hardware is deployed rather than virtualized hardware for two reasons:

- The servers are scaled to utilize eighty percent of resources during the worst-failure mode.
- Virtualization adds an additional layer of management and complexity, which introduces additional recovery modes that do not add value, as Exchange provides equivalent functionality out of the box.

By deploying multi-role servers, the architecture is simplified as all servers have the same hardware, installation process, and configuration options. Consistency across servers also simplifies administration. Multi-role servers provide more efficient use of server resources by distributing the Client Access and Mailbox resources across a larger pool of servers. Client Access and Database Availability Group (DAG) resiliency is also increased, as there are more servers available for the load-balanced pool and for the DAG.

Commodity server platforms (e.g., 2U, dual socket servers with no more than 24 processor cores and 96GB of memory, that hold 12 large form-factor drive bays within the server chassis) are used in the PA. Additional drive bays can be deployed per-server depending on the number of mailboxes, mailbox size, and the server's scalability.

Each server houses a single RAID1 disk pair for the operating system, Exchange binaries, protocol/client logs, and transport database. The rest of the storage is configured as JBOD, using large capacity 7.2K RPM serially attached SCSI (SAS) disks (while SATA disks are also available, the SAS equivalent provides better IO and a lower annualized failure rate). BitLocker is used to encrypt each disk, thereby providing data encryption at rest and mitigating concerns around data theft via disk replacement.

To ensure that the capacity and IO of each disk is used as efficiently as possible, four database copies are deployed per-disk. The normal run-time copy layout (calculated in the Exchange 2013 Server Role Requirements Calculator) ensures that there is no more than a single copy activated per-disk.

Figure 2: Server Design

At least one disk in the disk pool is reserved as a hot spare. AutoReseed is enabled and quickly restores database redundancy after a disk failure by activating the hot spare and initiating database copy reseeds.
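A minimal sketch of enabling the AutoReseed layout on a DAG (the DAG name and root paths are hypothetical; four copies per volume matches the per-disk layout described above):

# Configure the AutoReseed mount-point roots and copies-per-volume count for the DAG.
Set-DatabaseAvailabilityGroup DAG1 -AutoDagVolumesRootFolderPath "C:\ExchangeVolumes" -AutoDagDatabasesRootFolderPath "C:\ExchangeDatabases" -AutoDagDatabaseCopiesPerVolume 4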
Database Availability Group Design

Within each site resilient datacenter pair you will have one or more DAGs.

DAG Configuration

As with the namespace model, each DAG within the site resilient datacenter pair operates in an unbound model with active copies distributed equally across all servers in the DAG. This model provides two benefits:

- Ensures that each DAG member's full stack of services is being validated (client connectivity, replication pipeline, transport, etc.).
- Distributes the load across as many servers as possible during a failure scenario, thereby only incrementally increasing resource utilization across the remaining members within the DAG.

Each datacenter is symmetrical, with an equal number of member servers within a DAG residing in each datacenter. This means that each DAG contains an even number of servers and uses a witness server for quorum arbitration.

The DAG is the fundamental building block in Exchange 2013. With respect to DAG size, a larger DAG provides more redundancy and resources. Within the PA, the goal is to deploy larger DAGs (typically starting out with an eight member DAG and increasing the number of servers as required to meet your requirements) and only create new DAGs when scalability introduces concerns over the existing database copy layout.

DAG Network Design

Since the introduction of continuous replication in Exchange 2007, Exchange has recommended multiple replication networks for separating client traffic from replication traffic. Deploying two networks allows you to isolate certain traffic along different network pathways and ensure that during certain events (e.g., reseed events) the network interface is not saturated (which is an issue with 100Mb, and to a certain extent, 1Gb interfaces). However, for most customers, having two networks operating in this manner was only a logical separation, as the same copper fabric was used by both networks in the underlying network architecture.

With 10Gb networks becoming the standard, the PA moves away from the previous guidance of separating client traffic from replication traffic. A single network interface is all that is needed because ultimately our goal is to achieve a standard recovery model regardless of the failure - whether a server failure occurs or a network failure occurs, the result is the same: a database copy is activated on another server within the DAG. This architectural change simplifies the network stack, and obviates the need to eliminate heartbeat cross-talk.

Witness Server Placement

Ultimately, the placement of the witness server determines whether the architecture can provide automatic datacenter failover capabilities or whether it will require a manual activation to enable service in the event of a site failure.

If your organization has a third location with a network infrastructure that is isolated from network failures that affect the site resilient datacenter pair in which the DAG is deployed, then the recommendation is to deploy the DAG's witness server in that third location. This configuration gives the DAG the ability to automatically failover databases to the other datacenter in response to a datacenter-level failure event, regardless of which datacenter has the outage.

Figure 3: DAG (Three Datacenter) Design

If your organization does not have a third location, then place the witness server in one of the datacenters within the site resilient datacenter pair. If you have multiple DAGs within the site resilient datacenter pair, then place the witness server for all DAGs in the same datacenter (typically the datacenter where the majority of the users are physically located).
Also, make sure the Primary Active Manager (PAM) for each DAG is located in that same datacenter.

Data Resiliency

Data resiliency is achieved by deploying multiple database copies. In the PA, database copies are distributed across the site resilient datacenter pair, thereby ensuring that mailbox data is protected from software, hardware and even datacenter failures. Each database has four copies, with two copies in each datacenter, which means at a minimum the PA requires four servers. Out of these four copies, three of them are configured as highly available. The fourth copy (the copy with the highest Activation Preference) is configured as a lagged database copy. Due to the server design, each copy of a database is isolated from its other copies, thereby reducing failure domains and increasing the overall availability of the solution, as discussed in DAG: Beyond the "A".

The purpose of the lagged database copy is to provide a recovery mechanism for the rare event of system-wide, catastrophic logical corruption. It is not intended for individual mailbox recovery or mailbox item recovery. The lagged database copy is configured with a seven day ReplayLagTime. In addition, the Replay Lag Manager is also enabled to provide dynamic log file play down for lagged copies. This feature ensures that the lagged database copy can be automatically played down and made highly available in the following scenarios:

- When a low disk space threshold is reached
- When the lagged copy has physical corruption and needs to be page patched
- When there are fewer than three available healthy copies (active or passive) for more than 24 hours

When using the lagged database copy in this manner, it is important to understand that the lagged database copy is not a guaranteed point-in-time backup. The lagged database copy will have an availability threshold, typically around 90%, due to periods where the disk containing a lagged copy is lost due to disk failure, periods where the lagged copy becomes an HA copy (due to automatic play down), and periods where the lagged database copy is rebuilding the replay queue.

To protect against accidental (or malicious) item deletion, Single Item Recovery or In-Place Hold technologies are used, and the Deleted Item Retention window is set to a value that meets or exceeds any defined item-level recovery SLA. With all of these technologies in play, traditional backups are unnecessary; as a result, the PA leverages Exchange Native Data Protection.

Summary

The PA takes advantage of the changes made in Exchange 2013 to simplify your Exchange deployment, without decreasing the availability or the resiliency of the deployment. And in some scenarios, when compared to previous generations, the PA increases availability and resiliency of your deployment.

Ross Smith IV
Principal Program Manager
Office 365 Customer Experience

Availability Group Database Reports Not Synchronizing / Recovery Pending After Database Log File Inaccessible
First published on MSDN on Nov 29, 2017

You may find that one or more availability group databases is reported 'Not Synchronizing / Recovery Pending' on the primary replica, or 'Not Synchronizing' on one of the secondary replicas.
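One minimal way to confirm the reported state, sketched here as an addition to the truncated post (the instance name is a placeholder), is to query the local replica-state DMV:

# Minimal sketch: check synchronization state of local availability group databases.
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=SQLNODE1;Integrated Security=True")
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = @"
SELECT DB_NAME(drs.database_id) AS database_name,
       drs.synchronization_state_desc,
       drs.database_state_desc
FROM sys.dm_hadr_database_replica_states AS drs
WHERE drs.is_local = 1;
"@
$reader = $cmd.ExecuteReader()
while ($reader.Read()) { "{0}: {1} / {2}" -f $reader[0], $reader[1], $reader[2] }
$conn.Close()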
Connect to SQL Server Using Application Intent Read-Only

First published on MSDN on Aug 02, 2013

Once you have configured your SQL Server availability group for read-only routing (see blog 'End to End - Using a Listener to Connect to a Secondary Replica (Read-Only Routing)'), you must install the SQL Server Native Client (SNAC) provider that supports application intent connections, and you must write your application using the correct and necessary connection properties to successfully connect to the secondary read-only replica.
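A minimal connection sketch (the listener, port, and database names are hypothetical): ApplicationIntent=ReadOnly triggers read-only routing to a readable secondary, and MultiSubnetFailover=True is the usual companion setting for listeners that span subnets:

# Minimal sketch: connect via the AG listener with read-only intent.
$connString = "Server=tcp:aglistener.contoso.com,1433;Database=SalesDB;" +
              "Integrated Security=True;ApplicationIntent=ReadOnly;MultiSubnetFailover=True"
$conn = New-Object System.Data.SqlClient.SqlConnection($connString)
$conn.Open()
# Read-only routing should have landed this session on a readable secondary.
$cmd = $conn.CreateCommand()
$cmd.CommandText = "SELECT @@SERVERNAME;"
$cmd.ExecuteScalar()
$conn.Close()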
Lagged Database Copy Enhancements in Exchange Server 2016 CU1

The high availability capabilities of the lagged database copy are enhanced in the upcoming release of Exchange 2016 Cumulative Update 1.

ReplayLagManager

As you may recall, lagged copies can care for themselves by invoking automatic log replay to play down the log files in certain scenarios:

- When a low disk space threshold (10,000MB) is reached
- When the lagged copy has physical corruption and needs to be page patched
- When there are fewer than three available healthy HA copies for more than 24 hours

Play down based on healthy copy status requires ReplayLagManager to be enabled. Beginning with Exchange 2016 CU1, ReplayLagManager is enabled by default. You can change this via the following command:

Set-DatabaseAvailabilityGroup <DAGName> -ReplayLagManagerEnabled $false

Deferred Lagged Copy Play Down

When one of the above conditions is triggered, the Replication Service will initiate a play down event for the lagged database copy. However, there are times where this may not be ideal. For example, consider the scenario where there are four database copies on a disk: one passive, one lagged, and two active. Initiating a play down event on the lagged copy has the potential to impact any active copies on that disk - replaying log files generates IO and introduces disk latency as the disk head moves, which impacts users accessing their data on the active copies.

To address this concern, beginning with Cumulative Update 1 for Exchange 2016, the lagged copy's play down activity is tied to the health of the disk by evaluating the disk's IO latency:

- If the disk's read IO latency is above 35ms, the play down event is deferred. In the event that there is a disk capacity concern, the disk latency deferral will be ignored and the lagged copy will play down.
- Once the disk's read IO latency drops below 25ms, the play down event is resumed.

As a result, deferred lagged copy play down reduces the IO burstiness of lagged copy play down events and ensures that local active copies on the lagged copy's disk are not affected. IO sizing of a lagged database copy does not change with this feature (nor does it affect the IO sizing of an active copy); you still must ensure there is available IO headroom in the event that the lagged copy becomes active.

Consider the following example. The y axis is disk latency, measured in milliseconds. The x axis is a 24-hour period. As you can see from the graph, between the hours of 1am and 9am, the disk IO latency is below 25ms, meaning that lagged copy replay is allowed. At 10am, the latency exceeds 35ms and this continues until about 2pm; during this time period, lagged copy replay is delayed or deferred. At 2pm, the latency drops below 25ms and lagged copy replay resumes. Latency increases again at 4pm and the process repeats itself.

By default, the maximum amount of time that a play down event can be deferred is 24 hours. You can adjust this via the following command:

Set-MailboxDatabaseCopy <database name\server> -ReplayLagMaxDelay:<value in the format of 00:00:00>

If you want to disable deferred play down, you can set the ReplayLagMaxDelay value to ([TimeSpan]::Zero).

The following events are recorded in the Microsoft-Exchange-HighAvailability/Monitoring crimson channel when log replay is deferred or resumed:

- Event 750 - Replay Lag Manager requested activating replay lag delay (suspending log replay) for database copy '%1\%2' after a suppression interval of %4.
  Delay Reason: %6
- Event 751 - Replay Lag Manager successfully activated replay lag delay (suspended log replay) for database copy '%1\%2'. Delay Reason: %4
- Event 752 - Replay Lag Manager failed to activate replay lag delay (suspend log replay) for database copy '%1\%2'. Error: %4
- Event 753 - Replay Lag Manager requested deactivating replay lag (resuming log replay) for database copy '%1\%2' after a suppression interval of %4. Reason: %5
- Event 754 - Replay Lag Manager successfully deactivated replay lag (resumed log replay) for database copy '%1\%2'. Reason: %4
- Event 755 - Replay Lag Manager failed to deactivate replay lag (resume log replay) for database copy '%1\%2'. Error: %4
- Event 756 - Replay Lag Manager will attempt to deactivate replay lag (resume log replay) for database copy '%1\%2' because it has reached the maximum allowed lag duration. Detailed Reason: %5

The following events are recorded in the Microsoft-Exchange-HighAvailability/Operational crimson channel when log replay is deferred or resumed:

- Event 748 - Log Replay suspend/resume state for database '%1' has changed. (LastSuspendReason=%3, CurrentSuspendReason=%4, CurrentSuspendReasonMessage=%5)
- Event 2050 - Suspend log replay requested for database guid=%1, reason='%2'.
- Event 2051 - Suspend log replay for database guid=%1 succeeded.
- Event 2052 - Suspend log replay for database guid=%1 failed: %2.
- Event 2053 - Resume log replay requested for database guid=%1.
- Event 2054 - Resume log replay for database guid=%1 succeeded.
- Event 2055 - Resume log replay for database guid=%1 failed: %2.

Summary

The changes discussed above continue our work in improving the Preferred Architecture by ensuring that users have the best possible experience on the Exchange platform. As always, we welcome your feedback.

Ross Smith IV
Principal Program Manager
Office 365 Customer Experience
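As an illustrative addition, the deferral and resume events listed above can be pulled from the crimson channel with standard event tooling; a minimal sketch:

# Minimal sketch: list recent Replay Lag Manager deferral/resume events (IDs 750-756)
# from the monitoring crimson channel referenced above.
Get-WinEvent -LogName "Microsoft-Exchange-HighAvailability/Monitoring" -MaxEvents 200 |
    Where-Object { $_.Id -ge 750 -and $_.Id -le 756 } |
    Select-Object TimeCreated, Id, Message |
    Format-Table -AutoSize -Wrap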
DAG Activation Preference Behavior Change in Exchange Server 2016 CU2

Every copy of a mailbox database in a DAG is assigned an activation preference number. This number is used by the system as part of the passive database activation process, and by administrators when performing database balancing operations for a DAG. This number is expressed as the ActivationPreference property of a mailbox database copy. The value for the ActivationPreference property is a number equal to or greater than 1, where 1 is at the top of the preference order. When a DAG is first implemented, by default all active database copies have an ActivationPreference of 1.

However, due to the inherent nature of DAGs (e.g., databases experience switchovers and failovers), active mailbox database copies will change hosts several times throughout a DAG's lifetime. As a result of this inherent behavior, a mailbox database may remain active on a database copy which is not the most preferred copy. Prior to Exchange 2016 Cumulative Update 2 (CU2), Exchange Server administrators had to either manually activate their preferred database copy, or use the RedistributeActiveDatabases.ps1 script to balance the database copies across a DAG.

Starting with CU2 (which will be releasing soon), the ability for the Primary Active Manager in the DAG to perform periodic discretionary moves that activate the copy the administrator has defined as most preferred is now built into the product. A new DAG property called PreferenceMoveFrequency has been added that defines the frequency (measured in time) at which the Microsoft Exchange Replication service will rebalance the database copies by performing a lossless switchover that activates the copy with an ActivationPreference of 1 (assuming the target server and database copy are healthy).

Note: In order to take advantage of this feature, ensure all Mailbox servers within the DAG are upgraded to Exchange 2016 CU2.

By default, the Replication service will inspect the database copies and perform a rebalance every one hour. You can modify this behavior using the following command:

Set-DatabaseAvailabilityGroup <Name> -PreferenceMoveFrequency <value in the format of 00:00:00>

To disable this behavior, configure the PreferenceMoveFrequency value to ([System.Threading.Timeout]::InfiniteTimeSpan).

If you are leaving the behavior enabled, and you have created a scheduled task to execute RedistributeActiveDatabases.ps1, you can remove the scheduled task after upgrading the DAG to CU2. We recommend taking advantage of this behavior to ensure that your DAG remains optimally balanced.

This feature continues our work to improve the Preferred Architecture by ensuring that users have the best possible experience on Exchange Server. As always, we welcome your feedback.

Ross Smith IV
Principal Program Manager
Office 365 Customer Experience

Updates 6/21/16: Updated information on how to disable PreferenceMoveFrequency without requiring a Replication service restart. If you set it to [Timespan]::Zero, you will need to cycle the Replication service.
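As a quick illustrative check (an addition, not from the original post), you can report databases whose active copy is not the ActivationPreference 1 copy; the comparison assumes short server names prefix the mounted server's FQDN:

# Minimal sketch: report databases whose active copy is not the most preferred copy.
Get-MailboxDatabase -Status | ForEach-Object {
    # ActivationPreference is a list of (server, preference) pairs; pick preference 1.
    $preferred = ($_.ActivationPreference | Where-Object { $_.Value -eq 1 }).Key
    if ($_.MountedOnServer -notlike "$preferred*") {
        "{0}: active on {1}, preferred copy is on {2}" -f $_.Name, $_.MountedOnServer, $preferred
    }
}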
Connection Timeouts in Multi-subnet Availability Group

First published on MSDN on Jun 03, 2014

THE DEFINITION

One of the issues that generates a lot of call volume we see on the AlwaysOn team is dealing with connectivity issues to the availability group listener in multi-subnet environments.

New Support Policy for Repaired Exchange Databases
The database repair process is often used as a last-ditch effort to recover an Exchange database when no other means of recovery is available. The process should only be followed at the advice of Microsoft Support and after determining that all other recovery options have been exhausted. For many years, across many versions of Exchange, the repair process has largely been the same. However, that process is changing, based on information Microsoft has gathered from an extensive analysis of support cases.

In short, Microsoft is changing the support policy for databases that have had a repair operation performed on them. Originally a database was supported if the repair was performed using ESEUTIL and ISINTEG/repair cmdlets. Under the new support policy, any database where the repair count is greater than 0 will need to be evacuated - all mailboxes on such a database will need to be moved to a new database.

Existing Repair Process

The process consists of three steps:

1. Repair the database at the page level
2. Defragment the database to restructure and recreate the database
3. Repair the logical structures within the database

Step 1 of the repair process is accomplished by using ESEUTIL /p. This is typically performed when there is page level corruption in the database - for example, a -1018 JET error, or when a database is left in dirty shutdown state as the result of not having the necessary log files to bring the database to a clean shutdown state. After executing ESEUTIL /p you are prompted to confirm that data loss may result. Selecting OK is required to continue.

[PS] C:\Program Files\Microsoft\Exchange Server\Mailbox\First Storage Group>eseutil /p '.\Mailbox Database.edb'

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.03
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating REPAIR mode...
Database: .\Mailbox Database.edb
Temp. Database: TEMPREPAIR4520.EDB

Checking database integrity.
The database is not up-to-date. This operation may find that this database is corrupt because data from the log files has yet to be placed in the database. To ensure the database is up-to-date please use the 'Recovery' operation.

Scanning Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
.
Rebuilding MSysObjectsShadow from MSysObjects.
Scanning Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
...................................................
Checking the database.
Scanning Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
...................................................
Scanning the database.
Scanning Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
...................................................
Repairing damaged tables.
Scanning Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
...................................................
Repair completed. Database corruption has been repaired!

Note: It is recommended that you immediately perform a full backup of this database. If you restore a backup made before the repair, the database will be rolled back to the state it was in at the time of that backup.

Operation completed successfully with 595 (JET_wrnDatabaseRepaired, Database corruption has been repaired) after 30.187 seconds.
At this point, the database should be in a clean shutdown state and the repair process may proceed. This can be verified with ESEUTIL /mh.

[PS] C:\Program Files\Microsoft\Exchange Server\Mailbox\First Storage Group>eseutil /mh '.\Mailbox Database.edb'

State: Clean Shutdown

Step 2 is to defragment the database using ESEUTIL /d. Defragmentation requires significant free space on the volume that will host the temporary database (typically 110% of the size of the database must be available as free disk space).

[PS] C:\Program Files\Microsoft\Exchange Server\Mailbox\First Storage Group>eseutil /d '.\Mailbox Database.edb'

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.03
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating DEFRAGMENTATION mode...
Database: .\Mailbox Database.edb

Defragmentation Status (% complete)
0 10 20 30 40 50 60 70 80 90 100
|----|----|----|----|----|----|----|----|----|----|
...................................................
Moving 'TEMPDFRG3620.EDB' to '.\Mailbox Database.edb'... DONE!

Note: It is recommended that you immediately perform a full backup of this database. If you restore a backup made before the defragmentation, the database will be rolled back to the state it was in at the time of that backup.

Operation completed successfully in 7.547 seconds.

Step 3 is the logical repair of the objects within the database. The method used to accomplish this varies by Exchange version. In Exchange 2007, ISINTEG is used to perform the logical repair, as illustrated in the following example:

C:\>isinteg -s wingtip-e2k7 -fix -test alltests -verbose -l c:\isinteg.log
Databases for server wingtip-e2k7:
Only databases marked as Offline can be checked
Index Status Database-Name
Storage Group Name: First Storage Group
1 Offline Mailbox Database
Enter a number to select a database or press Return to exit. 1
You have selected First Storage Group / Mailbox Database.
Continue?(Y/N)y

Test Categorization Tables result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Restriction Tables result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Search Folder Links result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Global result: 0 error(s); 0 warning(s); 0 fix(es); 1 row(s); time: 0h:0m:0s
Test Delivered To result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Repl Schedule result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Timed Events result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test reference table construction result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Folder result: 0 error(s); 0 warning(s); 0 fix(es); 4996 row(s); time: 0h:0m:2s
Test Deleted Messages result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Message result: 0 error(s); 0 warning(s); 0 fix(es); 1789 row(s); time: 0h:0m:0s
Test Attachment result: 0 error(s); 0 warning(s); 0 fix(es); 406 row(s); time: 0h:0m:0s
Test Mailbox result: 0 error(s); 0 warning(s); 0 fix(es); 249 row(s); time: 0h:0m:0s
Test Sites result: 0 error(s); 0 warning(s); 0 fix(es); 996 row(s); time: 0h:0m:0s
Test Categories result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Per-User Read result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test special folders result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Message Tombstone result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Test Folder Tombstone result: 0 error(s); 0 warning(s); 0 fix(es); 0 row(s); time: 0h:0m:0s
Now in test 20(reference count verification) of total 20 tests; 100% complete.

Typically when ISINTEG completes, it advises reviewing the isinteg.log file. At the end of the file is a summary section, listing the number of errors encountered. If the number of errors is greater than zero, you need to re-run the command. Continued repairs need to be performed until the error count reaches 0 or the same number of errors is encountered after two executions.

. . . . . SUMMARY . . . . .
Total number of tests : 20
Total number of warnings : 0
Total number of errors : 0
Total number of fixes : 0
Total time : 0h:0m:3s

In Exchange 2010 and later, ISINTEG was deprecated and certain functions were replaced by the New-MailboxRepairRequest and New-PublicFolderDatabaseRepairRequest cmdlets, both of which allow for repair operations to occur while the database is online.

Exchange 2010:

[PS] C:\Windows\system32>New-MailboxRepairRequest -Mailbox user252 -CorruptionType SearchFolder,FolderView,AggregateCounts,ProvisionedFolder,MessagePtagCN,MessageID

RequestID Mailbox ArchiveMailbox Database Server
--------- ------- -------------- -------- ------
7f499ce3-e Wingtip False Mailbox. WINGTIP-E2K10.Wingti...

Exchange 2013:

[PS] C:\>New-MailboxRepairRequest -Mailbox User532 -CorruptionType SearchFolder,FolderView,AggregateCounts,ProvisionedFolder,ReplState,MessagePTAGCn,MessageID,RuleMessageClass,RestrictionFolder,FolderACL,UniqueMidIndex,CorruptJunkRule,MissingSpecialFolders,DropAllLazyIndexes,ImapID,ScheduledCheck,Extension1,Extension2,Extension3,Extension4,Extension5

Identity Task Detect Only Job State Progress
-------- ---- ----------- --------- --------
a44acf2b {Sea False Queued 0

Upon completion of these repair options, typically the database could be mounted and normal user operations continued.
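In Exchange 2013, a repair request's progress can be monitored while the database remains online; a minimal sketch (the database name is hypothetical):

# Minimal sketch: monitor online repair progress for a database (Exchange 2013).
Get-MailboxRepairRequest -Database "Mailbox Database 01" | Format-List Identity, JobState, Progress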
Support Change for Repaired Databases

Over the course of the last two years, we have reviewed Watson dumps for Information Store crashes that have been automatically uploaded by customers' servers. The crashes were caused by inexplicable, seemingly impossible store level corruption. The types of store level corruption varied, and they came from many different databases, servers, Exchange versions, and customers. In almost all of these cases one significant fact was noted - the repair count recorded on the database was greater than 0.

When ESEUTIL /p is executed, and a repair to the database is necessary, the repair count is incremented and the repair time is recorded in the header of the database. The repair information stored in the database header is retained after offline defragmentation. Repair information in the header may be viewed with ESEUTIL /mh.

[PS] C:\Program Files\Microsoft\Exchange Server\Mailbox\First Storage Group>eseutil /mh '.\Mailbox Database.edb'

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.03
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...
Database: .\Mailbox Database.edb

File Type: Database
Format ulMagic: 0x89abcdef
Engine ulMagic: 0x89abcdef
Format ulVersion: 0x620,12
Engine ulVersion: 0x620,12
Created ulVersion: 0x620,12
DB Signature: Create time:04/05/2015 08:39:24 Rand:2178804664 Computer:
cbDbPage: 8192
dbtime: 1059112 (0x102928)
State: Clean Shutdown
Log Required: 0-0 (0x0-0x0)
Log Committed: 0-0 (0x0-0x0)
Streaming File: No
Shadowed: Yes
Last Objid: 4020
Scrub Dbtime: 0 (0x0)
Scrub Date: 00/00/1900 00:00:00
Repair Count: 2
Repair Date: 04/05/2015 08:39:24
Old Repair Count: 0
Last Consistent: (0x0,0,0) 04/05/2015 08:39:25
Last Attach: (0x0,0,0) 04/05/2015 08:39:24
Last Detach: (0x0,0,0) 04/05/2015 08:39:25
Dbid: 1
Log Signature: Create time:00/00/1900 00:00:00 Rand:0 Computer:
OS Version: (6.1.7601 SP 1 NLS 60101.60101)
Previous Full Backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
Previous Incremental Backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
Previous Copy Backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
Previous Differential Backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
Current Full Backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
Current Shadow copy backup: Log Gen: 0-0 (0x0-0x0) Mark: (0x0,0,0) Mark: 00/00/1900 00:00:00
cpgUpgrade55Format: 0
cpgUpgradeFreePages: 0
cpgUpgradeSpaceMapPages: 0
ECC Fix Success Count: none
Old ECC Fix Success Count: none
ECC Fix Error Count: none
Old ECC Fix Error Count: none
Bad Checksum Error Count: none
Old bad Checksum Error Count: none

Operation completed successfully in 0.78 seconds.

Because uncorrectable corruption can linger in a repaired database and cause store crashes and server instability, we have changed our support policy to require the evacuation of any Exchange database that persistently has a repair count or old repair count equal to or greater than 1. Moving mailboxes (and public folders) to new databases ensures that the underlying database structure is good and free from any corruption that might not be corrected by the database repair process, and it helps prevent store crashes and server instability.

Tim McMichael
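A minimal sketch of the evacuation itself (the database names are hypothetical): once a non-zero repair count is confirmed in the header, move every mailbox to a known-good database, and retire the repaired database only after all moves complete.

# Minimal sketch: evacuate a repaired database to a newly created one.
Get-Mailbox -Database "DB-Repaired" -ResultSize Unlimited |
    New-MoveRequest -TargetDatabase "DB-New"

# Track progress; remove the old database only after all moves finish.
Get-MoveRequest | Get-MoveRequestStatistics | Format-Table DisplayName, StatusDetail, PercentComplete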
Troubleshooting REDO queue build-up (data latency issues) on AlwaysOn Readable Secondary Replicas using the WAIT_INFO Extended Event

First published on MSDN on Jan 06, 2015

PROBLEM

You have confirmed excessive build up of REDO queue on an AlwaysOn Availability Group secondary replica by one of the following methods: Querying sys.
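Although the post is truncated here, a minimal sketch of one such confirmation (the instance name is a placeholder) is to query the replica-state DMV for the redo queue size and redo rate on the secondary:

# Minimal sketch: check REDO queue depth and redo rate on a secondary replica.
$conn = New-Object System.Data.SqlClient.SqlConnection("Server=SQLNODE2;Integrated Security=True")
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandText = @"
SELECT DB_NAME(database_id) AS database_name,
       redo_queue_size,  -- KB of log received but not yet redone on this secondary
       redo_rate         -- KB/sec currently being redone
FROM sys.dm_hadr_database_replica_states
WHERE is_local = 1;
"@
$reader = $cmd.ExecuteReader()
while ($reader.Read()) { "{0}: redo queue {1} KB at {2} KB/s" -f $reader[0], $reader[1], $reader[2] }
$conn.Close()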