APIM Active Repair Items

Sev 1-2 incidents · Last 365 days · Open items only · Generated: 2026-05-29 19:03 UTC
500 Total Items
500 Visible
180 Unique Incidents
170 Unassigned
Work Item Description Status Owner ETA Sev IcM Incident Title Outage Team Created
37657942 Concurrent service-creation requests bypass per-subscription service-limit caps (TOCTOU race) In Review Saikiran Vukyam Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-27
37515023 BRAIN should detect anomalous traffic and provide that as an additional signal To Do Shilpa Mani Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-13
37511203 Add/update TSG for customer subscription lookup Committed Neha Gupta Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-13
37401353 Ensure new beta features are logged in GetBetaFeaturesDetails Kusto function instantly, instead of after a 6h wait New Unassigned Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396736 Visualize DB connection counter in SMAPI dashboard New Vitalii Kurokhtin Within 14 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396719 Existing SMAPI alerts for DB CPU did not fire and should use Sev2 New Vitalii Kurokhtin Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396592 Limiting # of allowed services by Azure subscription type (ie trial limited to 5 services) New Shilpa Mani Within 14 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396574 Extend SMAPI dashboard to provide deep link to Antares detector for scale units New Vitalii Kurokhtin Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396567 Provide alert for high response times in SMAPI New Vitalii Kurokhtin Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396561 Provide alert for connection starvation of DB in SMAPI New Vitalii Kurokhtin Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396544 Provide alert for traffic increase blocked by RP throttling New Samir Solanki Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396543 Provide ACIS to fully block tenant/subscription New Shilpa Mani Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37396540 TSG for IMs on how to handle DDOS/bad actors, engage CDOC, lock out actor New Tom Kerkhove Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37167427 RP Throttle: Implement throttling for SubscriptionId In Review Samir Solanki Within 30 days 1 772581567 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-04-03
37132082 RP endpoints should accept boolean input as well as stringified boolean New Unassigned Within 60 days 1 738209211 [Emerging Issue] DEV SKU: Unable to create self-hosted gateway in version 0.50.x... Yes Backend 2026-03-17
37132044 Alert on problems when creating self hosted gateway (4xx response is returned) New Unassigned Within 30 days 1 738209211 [Emerging Issue] DEV SKU: Unable to create self-hosted gateway in version 0.50.x... Yes Backend 2026-03-17
37132028 Improve test fixtures, to contain same entities as a newly created service New Unassigned Within 30 days 1 738209211 [Emerging Issue] DEV SKU: Unable to create self-hosted gateway in version 0.50.x... Yes Backend 2026-03-17
1862435 Improve Brain Coverage by integrating [Monitor Type]: $Monitor_Name (Service Health Repair) New shellyg 1 1 695770070 [S500][CSS] -AOAI-Reliability [CareCore National LLC] [Encountered high latency ... Yes Backend 2025-10-12
38141678 [RP] Investigate and fix rollback scenarios for Upgrade/Update orchestration, so that incorrect configuration is not pre... New Unassigned Within 30 days 2 51000001038159 API Management service is down due to an unknown reason - Network connectivity No Backend 2026-05-28
38141635 [SKUv1][ProxyInfra] Inconsistency in logged DeploymentVersion - actual version is different from the logged one New Unassigned Within 30 days 2 51000001038159 API Management service is down due to an unknown reason - Network connectivity No Backend 2026-05-28
38105964 Integrate Service Monitor '[Public] More than 4 customer orchestrations failed due to lack ' with Brain for Auto Outage ... To Do Unassigned 1 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-05-26
38105963 Integrate Service Monitor '[Public] [Gateway] NonVNET- Gateway 100% not reachable with 2+ a' with Brain for Auto Outage ... To Do Unassigned 1 2 757255502 [Public] [AutoComms] swedencentral - SkuV2 - BasicOrStandard - Impacted services... Yes Backend 2026-05-26
38078783 Onboard SMAPI DependencyComponents New Unassigned Within 30 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-22
38051028 Refactor ContractSecretsCleaner to handler-based architecture with backend credential cleanup In Review Rafal Mielowski Within 60 days 2 31000000587084 [MSRC] [113613] [SMAPI] - Azure API Management - Microsoft.ApiManagement/service... No Backend 2026-05-21
38058878 Runner/BVT validation in EUAP regions with alerting New Unassigned Within 14 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-21
38058871 Implement SMAPI 5XX monitoring New Unassigned Within 14 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-21
38057982 Make reboot cluster operation more resilient New Unassigned Within 14 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-21
38057971 Cluster app settings update must recycle the app New Unassigned Within 14 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-21
38057931 Remove dependency to regional global storage account for consumption SMAPI New Unassigned Within 30 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-21
38051893 [MSRC] [116794] Fix Soap action override issue as per msrc repro In Review Branimir Giurov Within 14 days 2 31000000605020 [MSRC] [116794] - Azure API Management - managed gateway operation matcher - SOA... No Backend 2026-05-20
38045651 Change AOAI tenant rollout to use max surge (but make configurable in pipeline) New Tom Kerkhove Within 30 days 2 796152710 [CallResult.OpenAI]APIM GatewayOverhead latency monitor for [Region:Sweden Centr... No Gateway 2026-05-20
38043152 Extend quota management job to request total regional core quota for a sub in a region on low capacity To Do Neha Gupta Within 30 days 2 789035878 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-05-20
38043119 Extend RCM Read Only Manager to consider total regional cores for closing/opening pools To Do Neha Gupta Within 30 days 2 789035878 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-05-20
38043107 Track Total Regional vCPUs (cores) across all SKUs in a region To Do Neha Gupta Within 30 days 2 789035878 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-05-20
38040479 Update Troubleshooting Guide to handle the version mismatches and CRP MaxSurge errors In Review Srinivas Anandapu Within 14 days 2 793537192 Emerging Issue - Administrative Service Update Operations Start Failing with VMS... No Backend 2026-05-20
38040439 Auto-Rollback on Migration Failure In Review Srinivas Anandapu Within 14 days 2 793537192 Emerging Issue - Administrative Service Update Operations Start Failing with VMS... No Backend 2026-05-20
38040427 Proactive Version Mismatch Monitoring and CRP MaxSurge Error Alerting In Review Srinivas Anandapu Within 14 days 2 793537192 Emerging Issue - Administrative Service Update Operations Start Failing with VMS... No Backend 2026-05-20
38034482 [RP] HttpClient anti-pattern in HealthCheckPingUtilities causes high CPU from ServicePoint accumulation New Samir Solanki Within 30 days 2 797427647 [api-kw1-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No Backend 2026-05-19
38013722 Try-catch errors/timeouts originating from Microsoft.Advisor calls in the Service Overview Blade. New Javier Borrego Within 30 days 2 800074452 [Public] [AzurePortalWAWSAlert] Blades failed to load at least 1 time for 5 user... No AzurePortal 2026-05-18
38011737 Ensure BRAIN detects broken devportal signup flow To Do Unassigned Within 30 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-05-18
38010967 Make backends credentials field cleanup done by default To Do Rafal Mielowski Within 60 days 2 31000000587084 [MSRC] [113613] [SMAPI] - Azure API Management - Microsoft.ApiManagement/service... No Backend 2026-05-18
38009312 Automatically partition DNS zones across multiple subscriptions, rather than a single subscription New Tom Kerkhove Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-05-18
38001706 Improve health monitor dashboard to show queue error details and average processing time New Samir Solanki Within 30 days 2 797427647 [api-kw1-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No Backend 2026-05-18
38001445 Re-architect way we process health monitoring for services to be isolated from each other and improve scalability Active Tom Kerkhove Within 14 days 2 797427647 [api-kw1-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No Backend 2026-05-18
38001388 Improve health check partitioning for AOAI Hub to not do by subscription ID To Do Unassigned Within 14 days 2 797427647 [api-kw1-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No Backend 2026-05-18
38001068 Code fix: Gateway.IoC.Dedicated\CacheModule.cs creates its own inline new RedisDistributedCacheProviderSettings() WITHOU... New Max Podriezov Within 14 days 2 781005163 [AOAIFEAzureMonitorIncident][APIM] High Redis Latency on High Average Redis Late... No ServicingLoop 2026-05-18
37986778 Update custom domain should avoid delete/recreate To Do Nina Ren Within 30 days 2 51000001023424 APIM Service not refreshing the certificate for the traffic manager profile under... No Backend 2026-05-15
37981817 Create a P0 Scenario alert ensuring the entra ID user sign-up flow works with basic auth disabled To Do Unassigned Within 30 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-05-15
37981344 [RP] Register Microsoft.Compute/OptInScaleSetLargerMaxPipelinePreemptionCount on APIM-managed subscriptions To Do Martin Dechev Within 30 days 2 51000000982831 APIM stuck on updating state for over 6 hours. All of our critical public facing... No Platform(InternalOnly) 2026-05-15
37981343 [RP] Detect 'DSC never fired' early during VMSS rolling upgrade and abort+retry instead of waiting 3h To Do Martin Dechev Within 30 days 2 51000000982831 APIM stuck on updating state for over 6 hours. All of our critical public facing... No Platform(InternalOnly) 2026-05-15
37981342 [RP] Resilient VMSS rolling upgrade cancel: poll RU state after Cancel + bounded retry with backoff To Do Martin Dechev Within 30 days 2 51000000982831 APIM stuck on updating state for over 6 hours. All of our critical public facing... No Platform(InternalOnly) 2026-05-15
37979486 Consumption APIM improve readiness signalling to fix hostname binding drops To Do Unassigned Within 60 days 2 51000000969619 APIM Down No Platform(InternalOnly) 2026-05-15
37955320 Remove AppService endpoints from Geneva Actions To Do Unassigned Within 14 days 2 797427647 [api-kw1-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No Backend 2026-05-14
37947111 Detection for grandfathered limits not applied To Do Unassigned Within 14 days 2 787095677 Emerging Issue: 'Grandfathered' Limits no longer being applied after upgrade to ... No Backend 2026-05-13
37940964 Detection WI for User Registration Success Rate To Do Unassigned Within 14 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-05-13
37937940 Automated test for inconsistent APIM cache responses (repair item for IcM #21000001017622) To Do Nima Kamoosi Within 30 days 2 21000001017622 Inconsistent Responses Observed for Azure APIM Cache No Backend 2026-05-12
37335159 Support for RollingRecycle of the RP App without causing machine restart In Review Samir Solanki Within 14 days 2 793601657 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Platform(InternalOnly) 2026-05-12
37928771 Platform should not allow OS updates during business hours / based on service window New Unassigned Within 14 days 2 796152710 [CallResult.OpenAI]APIM GatewayOverhead latency monitor for [Region:Sweden Centr... No Gateway 2026-05-12
37928586 Configure service windows on AOAI Hub instances New Unassigned Within 14 days 2 796152710 [CallResult.OpenAI]APIM GatewayOverhead latency monitor for [Region:Sweden Centr... No Gateway 2026-05-12
37927470 Rollback background refresh for all of AOAI To Do Unassigned Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-12
37907626 Change AOAI Hub SRE agent to handle incidents from AOAI FE team New Unassigned Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907605 Surface background refresh information in Redis dashboard New Max Podriezov Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907590 Provide TSG on how to analyse cache usage, where to look, what to look for and guide DRIs Active Max Podriezov Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907581 Introduce TSG on how to rollout settings changes for AOAI with EV2 To Do Unassigned Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907532 Introduce TSG on how background refresh works and how to disable New Dean Ward Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907527 Introduce dedicated alert for background refresh to identify issues during refresh causing stale data New Dean Ward Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37907526 Only enable background refresh in TIP & Canary, for now To Do Unassigned Within 14 days 2 795262147 PROD - ML Workload Reliability dip below 95% for GetOpenAICompletionsResponseAsy... No Backend 2026-05-11
37904999 [Mitigation] Extend ReplaceVM auto-heal to trigger on per-instance anomalies (CCF, BackendConnectionFailure) vs. peers To Do Unassigned Within 30 days 2 51000001005278 Requests from APPGW to APIM - 504 Timeout No Backend 2026-05-11
37904987 [Detection] Availability Alert - Add per-RoleInstance ClientConnectionFailure (CCF) alert to detect single-VM degradatio... To Do Unassigned Within 14 days 2 51000001005278 Requests from APPGW to APIM - 504 Timeout No Backend 2026-05-11
37888522 [Throttling] Enable Gateway Throttling by default To Do Branimir Giurov Within 30 days 2 21000000998761 API Management (APIM) service is down No Platform(InternalOnly) 2026-05-08
37876964 [Throttling] Apply TLS throttling in http.sys To Do Branimir Giurov Within 30 days 2 21000000998761 API Management (APIM) service is down No Platform(InternalOnly) 2026-05-08
37866452 Improve alerting: cascading failure not caught until customer reported (IcM 21000000998761) New Unassigned Within 60 days 2 21000000998761 API Management (APIM) service is down No Platform(InternalOnly) 2026-05-07
37863918 Ev2 Global RP Pipeline broken with mismatch of region in StageMAp New Samir Solanki Within 14 days 2 792769102 APIM RP Fairfax Orchestration Failure - Azure Gov Unified customer unable to upd... No Backend 2026-05-07
37851468 [Security] Restore OAuth/OIDC credential redaction in API export — fix removed from main by merge cleanup New Macko Treder Within 30 days 2 31000000591485 [MSRC] [114268] - Azure API Management - <instance_name>.management.azure-... No Backend 2026-05-06
37836882 Run dummy upgrade for workspace gateway when certificate expiration triggers New Unassigned Within 60 days 2 51000000976727 Issue in connecting to APIM workspace gateway-2604090040002863 No SMAPI 2026-05-05
37836867 Create alert on certificate expiration for workspace gateway New Unassigned Within 60 days 2 51000000976727 Issue in connecting to APIM workspace gateway-2604090040002863 No SMAPI 2026-05-05
37835839 Add guardrail to check for breaking config when rollback New Unassigned Within 30 days 2 778045793 Content safety timeout/failure No Backend 2026-05-05
37825940 [Troubleshooting] Update SRE/livesite agent to detect & report abnormal nodes (per-RoleInstance CCF/BackendConnectionFai... To Do Shilpa Mani Within 14 days 2 51000001002629 UNIFIED STRATEGIC | 2604290030005467 | Intermittent connection failure Issue No Platform(InternalOnly) 2026-05-05
37825942 [Mitigation] Extend ReplaceVM auto-heal to trigger on per-instance anomalies (CCF) vs. peers To Do Unassigned Within 30 days 2 51000001002629 UNIFIED STRATEGIC | 2604290030005467 | Intermittent connection failure Issue No Platform(InternalOnly) 2026-05-05
37825752 [Detection] Availability Alert - Add per-RoleInstance ClientConnectionFailure (CCF) alert to detect single-VM degradatio... To Do Unassigned Within 14 days 2 51000001002629 UNIFIED STRATEGIC | 2604290030005467 | Intermittent connection failure Issue No Platform(InternalOnly) 2026-05-05
37758174 Rework tenant db migration process to accommodate multiple separate release trains New Rafal Mielowski Within 60 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37757611 DevPortal automated tests should have caught that disabling basic auth prevented Entra user creation To Do Roman Kolesnikov Within 30 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-04-30
37757544 fix: registrationEnabled enforcement blocks Entra ID users from registering when basic auth is disabled In Review Unassigned Within 14 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-04-30
36932768 Honor devportal settings when user registration is disabled In Review Ondrej Oprala Within 30 days 2 788655236 Emerging Issue: Developer Portal User Registration Fails with error "User r... Yes Backend 2026-04-30
37756455 Add regression guardrails for channel-aware version resolution in Undelete To Do Brian McAbee Within 14 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37756454 Add negative test scenario coverage for channel/version mismatch and dependency failures To Do Brian McAbee Within 30 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37756452 Create Undelete SLA alerting (success, latency, queue age) To Do Brian McAbee Within 14 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37756453 Regional Alert for Undelete Orchestration failures (Service Health Repair) To Do Brian McAbee Within 30 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37756451 Implement Undelete readiness gate before start of orchestration To Do Brian McAbee Within 30 days 2 771572160 Unable to Update service due to DB Script upgrade failing from 2.160.0 to 2.156.... Yes Backend 2026-04-30
37748959 Migrate DurableTask Hub to Azure Storage as Backend To Do Samir Solanki Within 60 days 2 771834088 [api-chn-prod-01-rp] API RP Orchestration Alert: BillingSkuV2 Orchestration has ... No Backend 2026-04-30
37748291 Implement automated recovery for stuck eternal orchestrations To Do Samir Solanki Within 30 days 2 771834088 [api-chn-prod-01-rp] API RP Orchestration Alert: BillingSkuV2 Orchestration has ... No Backend 2026-04-29
37748275 Add session ID casing investigation scenario to DurableTask troubleshooting docs To Do Samir Solanki Within 14 days 2 771834088 [api-chn-prod-01-rp] API RP Orchestration Alert: BillingSkuV2 Orchestration has ... No Backend 2026-04-29
37748270 Reduce BillingSkuV2 alert threshold from 24h to 4h To Do Unassigned Within 14 days 2 771834088 [api-chn-prod-01-rp] API RP Orchestration Alert: BillingSkuV2 Orchestration has ... No Backend 2026-04-29
37714528 Investigate why VM SKU fallback didn't help with activations Committed Srajan Agrawal Within 14 days 2 773367958 [api-usse5-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA ... No Capacity-Internal 2026-04-28
37704382 Alert/Monitor for consumption sku free certificate thumbprint cannot be found by App service(Service Health Report) To Do Nina Ren Within 30 days 2 772614131 [Emerging Issue] Consumption SKU services present the default *.azurewebsites.ne... Yes Backend 2026-04-27
37676908 Verify Responses with Large Header Do Not Cause Gateway Timeout In Review Mahsa Sadi Within 14 days 2 21000000991208 When calling an API with traffic flowing through a Self-Hosted Gateway, the requ... No Backend 2026-04-24
37660220 Integrate DNS zones in to RCM to monitor and reconcile quota state In Review Tom Kerkhove Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-23
37659372 Integrate [Public] eastus [v2] [SKUv1] Developer portal publishing SLA below 95% for at least 4 services in the last 2h ... To Do Unassigned Within 14 days 2 775833079 [Public] eastus [v2] [SKUv1] Developer portal publishing SLA below 95% for at le... Yes Backend 2026-04-23
37650758 Plug in brain to stop release rollout based on AOAI signal New Unassigned Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37650755 Alert for 5xx and subset of 4xx based on backend & gateway response in AOAI Hub New Unassigned Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37650745 Introduce capability to do reduced traffic for "new" scale units New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647916 Automatically pull bad scale units from rotation New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647913 Zero-touch by human for scale unit buildout Active Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647891 Introduce second scale unit in every scale group New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647759 Distribute scale units from same region across different tenant upgrade SDP stages New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647613 Introduce way to mirror traffic to sanity check new scale units before adding to rotation New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37647257 Introduce alert to detect background refresh growth above threshold New Nikita Govind Dhole Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-22
37646918 Event Table Connection string on the Workspace Gateway is not getting upgraded New Unassigned Within 14 days 2 21000000989825 APIM Workspace gateway is not working - 2604170050003263 No Backend 2026-04-22
37642636 [Change Oracle] Unexplained No Correlation New Prudence Phillips Within 30 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-04-22
37631504 Block FreeTrial emails in multi-tenant EmailProcessor using ServiceName from blob payload In Review Samir Solanki Within 30 days 2 780345525 Emerging Issue: APIM is not sending Email notifications No Backend 2026-04-21
37611203 Add release notes for self-hosted gateway to support policy documentation To Do Unassigned Within 30 days 2 21000000991208 When calling an API with traffic flowing through a Self-Hosted Gateway, the requ... No Backend 2026-04-20
37594340 Add Throttling functionality based on Subscriptions Quota To Do Samir Solanki Within 14 days 2 780345525 Emerging Issue: APIM is not sending Email notifications No Backend 2026-04-18
37580153 Disable EMAIL functionality for Free Trial Subscription To Do Samir Solanki Within 14 days 2 780345525 Emerging Issue: APIM is not sending Email notifications No Backend 2026-04-17
37567190 Introduce alert to identify out of DNS records in a DNS zone to proactively request more quota In Review Tom Kerkhove Within 14 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-16
37566932 Introduce dedicated DNS zones for AOAI Hub (and AI Gateway) to isolate from other offerings such as Workspace gateway Active Tom Kerkhove Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-16
37565539 Distribute DNS entries across multiple regional DNS zones based on quota New Unassigned Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-16
37565532 Integrate DNS zones in to RCM to monitor and reconcile quota state New Unassigned Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-16
37565523 Introduce alert to identify low on DNS records in a DNS zone to proactively request more quota New Tom Kerkhove Within 30 days 2 780717094 Australia East activations can fail for Workspace & AOAI Hub gateway due to DNS ... No Backend 2026-04-16
37565021 Teach SRE agent how to assess increasing quota is safe, and how to do it New Tom Kerkhove Within 30 days 2 780451813 [Public] [AOAI Hub] Scale Group(s) Low on Gateway Quota - Production-Australia E... No Backend 2026-04-16
37564994 Configure SRE agent to be able to use AOAI Hub ACIS calls to get scale group/unit info New Tom Kerkhove Within 30 days 2 780451813 [Public] [AOAI Hub] Scale Group(s) Low on Gateway Quota - Production-Australia E... No Backend 2026-04-16
37564984 Log AIPlatformScaleGroupDetails after updating scale group to ensure alerts, SRE agent and dashboards instantly reflect ... New Ethan Lao Within 30 days 2 780451813 [Public] [AOAI Hub] Scale Group(s) Low on Gateway Quota - Production-Australia E... No Backend 2026-04-16
37527374 Log FirstPartyIPServiceTag for all code paths using beta feature New Unassigned Within 14 days 2 21000000983047 2604140030000963 | APIM custom domain certificate update failed due to AzureFirs... No Backend 2026-04-14
37527339 Introduce validation for AzureFirstPartyServiceTag to not allow on * scope (AzureFirstPartyServiceTagValidate) New Tom Kerkhove Within 14 days 2 21000000983047 2604140030000963 | APIM custom domain certificate update failed due to AzureFirs... No Backend 2026-04-14
37526292 Do not allow using AzureFirstPartyServiceTag on 3P services New Tom Kerkhove Within 30 days 2 21000000983047 2604140030000963 | APIM custom domain certificate update failed due to AzureFirs... No Backend 2026-04-14
37525767 Do not apply AzureFirstPartyServiceTag for all subscriptions during AOAI Hub buildout New Unassigned Within 14 days 2 21000000983047 2604140030000963 | APIM custom domain certificate update failed due to AzureFirs... No Backend 2026-04-14
37511380 Provide ACIS to quarantine all AOAI (Hub) services at once New Ethan Lao Within 30 days 2 777722043 Huge number of 500s (ExpressionValueValidationFailure on cache-value refresh-aft... Yes Backend 2026-04-13
37511341 Introduce dedicated release channel for AOAI (Hub) Active Tom Kerkhove Within 30 days 2 777722043 Huge number of 500s (ExpressionValueValidationFailure on cache-value refresh-aft... Yes Backend 2026-04-13
37511387 Introduce alerts to identify scale units reporting failures after version upgrade (platform errors, or surge in Expressi... New Nikita Govind Dhole Within 30 days 2 777722043 Huge number of 500s (ExpressionValueValidationFailure on cache-value refresh-aft... Yes Backend 2026-04-13
37509652 Need to improve Outage Declaration mechanisms to meet TTO/OutageDeclaration targets(Service Health Repair) To Do Unassigned Within 14 days 2 777722043 Huge number of 500s (ExpressionValueValidationFailure on cache-value refresh-aft... Yes Backend 2026-04-13
37509642 Need to improve Detection mechanisms to meet TTD/Detection targets To Do Unassigned Within 14 days 2 777722043 Huge number of 500s (ExpressionValueValidationFailure on cache-value refresh-aft... Yes Backend 2026-04-13
37281549 Sev2 activation failures in southafricawest and southeastus5--due to newly created infra sub Committed Kriti Majumdar Within 30 days 2 773367958 [api-usse5-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA ... No Capacity-Internal 2026-04-09
37466473 APIM lacks customer self-service options for gateway host mitigation (restart/reset); recommend implementing safe, audit... New Unassigned Within 60 days 2 51000000976163 TCP connection failed between on-prem to APIM service No Gateway 2026-04-09
37466169 APIM lacks sufficient telemetry and diagnostics to detect, alert, and explain gateway host failures; recommend enhanceme... New Unassigned Within 60 days 2 51000000976163 TCP connection failed between on-prem to APIM service No Gateway 2026-04-09
37402088 Billing Orchestration gets Fired for deleted services To Do Nina Ren Within 30 days 2 773123079 [Public] [All SKUs] Billing orchestration unhealthy for last 24 hrs. Mitigation ... No Backend 2026-04-03
37392514 Gateway must fail expressions with access to System.Environment type Active Maxim Kim Within 14 days 2 31000000575108 [MSRC] [111643] - Azure API Management - Azure Managed SQL Instance - UDR MitM o... No Platform(InternalOnly) 2026-04-03
37390791 Add support in Durable Task Framework to be resilient to SessionId casing. To Do Samir Solanki Within 14 days 2 771834088 [api-chn-prod-01-rp] API RP Orchestration Alert: BillingSkuV2 Orchestration has ... No Backend 2026-04-03
37382729 22 active VMSS services have null DefaultIdentity — silently un-upgradeable, 5,763 VMSS deployment failures/week New Unassigned Within 30 days 2 51000000944047 UNIFIED | 2603110040012501 | Unable to reach Management Endpoint after applying ... No Backend 2026-04-02
37364727 Introduce alert for high connection acquisition time downstream New Tom Kerkhove Within 30 days 2 772116432 Huge number of failures 408s and 500s for Content Safety Service while using API... No Backend 2026-04-01
37351359 Remove variable declaration type validation In Review Ethan Lao Within 14 days 2 769545272 APIM Validation Fails when uploading API Policy No SMAPI 2026-03-31
37279730 Scale Up SouthEastAsia to P6 from P4 to handle the burst load To Do Unassigned Within 14 days 2 763459076 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-03-26
37279516 Optimize unnecessary writes for UserSubscription To Do Unassigned Within 14 days 2 763459076 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-03-26
37266410 Private Issuers changed to CCME New Unassigned 1 2 763708448 APIM Provisioning with SKU : Premium is failing in Delos Cloud Germany Central (... No BuildoutLoop 2026-03-25
37226382 Fix memory exhaustion on large number of APIs (10K+) and revisions (50K+) To Do Unassigned Within 60 days 2 21000000945361 APIM Capacity is running more than 90 % No Backend 2026-03-23
37198595 Gateway V2 CRI: DotNetty connection pool leases stale/closed channels causing UnexpectedDisconnectException (500s) New Unassigned Within 30 days 2 761489677 Cognitive Services APIM Gatewayv2 rollout causing unexpected increase in 500 err... No Backend 2026-03-20
37198594 Gateway V2 CRI: Add automatic retry on UnexpectedDisconnectException for backend connection failures New Unassigned Within 30 days 2 761489677 Cognitive Services APIM Gatewayv2 rollout causing unexpected increase in 500 err... No Backend 2026-03-20
37176718 Repairs for SSL Handshake Failures on GatewayV2 Gateway Endpoint In Review Branimir Giurov Within 14 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-20
37170414 Introduce alert to identify RP running out of connections for SQL To Do Tom Kerkhove Within 30 days 2 763459076 BRAIN detected an unusual trend in SLI "Success Rate" for APIManagemen... Yes Backend 2026-03-18
37100967 Service Upgrade vs Activations Scenario SQL AZ Fallback Consistency In Progress Srajan Agrawal Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-17
37132698 Integrate Service Monitor "[Public] More than 4 customer orchestrations failed due to lack " with Brain for Au... To Do Unassigned Within 30 days 2 707587308 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-16
37105616 Provide logging of failing TLS handshakes New Maxim Kim Within 30 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-13
37105044 Gateway V2 CRI: Requests with HOST header containing different IP than local port fail with 503, but v1 returns 404 In Review Joaquin Vano Within 14 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-13
37104396 Provide support for testing custom hostnames easily locally New Tom Kerkhove Within 30 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-13
36981624 Gateway V2 CRI: POST operations cause ClientConnectionFailure at transfer-response with response code 0 and backend resp... In Review Chun Ye Within 30 days 2 761523049 Emerging Issue - GatewayV2 - POST calls to backend hang No Backend 2026-03-13
37086753 Provide logging on HOST header used in ProxyRequest In Review Maxim Kim Within 14 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-12
37085956 [Repair-Item] Add TSG for when to declare an outage. To Do Unassigned Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-12
37085912 Investigate the reason why the existing alerts were not triggered during this period. Committed Neha Gupta Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-12
37085491 Improve this alert to early detect these failures using Runner subscription Committed Srajan Agrawal Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-12
37085630 Plan for AZ restriction To Do Neha Gupta Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-12
37074965 Regarding SQL Capacity, since these restrictions occur often, what actions should we take as a team? Should any communic... To Do Unassigned Within 30 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-11
37074946 What are the next steps for the UK South Region, given the significant VM quota constraints? To Do Unassigned Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-11
37042349 Skuv1[MachineStability]Onboard VMSS to ApplicationHealthExtension New Chukwuemeka Ojih Within 30 days 2 21000000915558 STRATEGIC | 2602230040009321 | APIM service returning 500 Internal Server Errors... No Backend 2026-03-11
37041781 Skuv1[MachineStability][Monitoring]Add automitigation when healthyVM from LB Status probe has no telemetry for > 20 m... New Chukwuemeka Ojih Within 30 days 2 21000000915558 STRATEGIC | 2602230040009321 | APIM service returning 500 Internal Server Errors... No Backend 2026-03-11
37071927 Auto-rollback customers facing SF resolution problems To Do Unassigned Within 14 days 2 760227783 Emerging Issue - GatewayV2 - Unable to connect to service fabric backend due to ... No Backend 2026-03-11
37071695 Introduce feature usage tracking of Service Fabric backends New Unassigned Within 30 days 2 760227783 Emerging Issue - GatewayV2 - Unable to connect to service fabric backend due to ... No Backend 2026-03-11
37071640 Re-introduce Service Fabric BVTs New Joaquin Vano Within 30 days 2 760227783 Emerging Issue - GatewayV2 - Unable to connect to service fabric backend due to ... No Backend 2026-03-11
36954238 Disallow using http-version="2" in forward-request when using SKU v2 or consumption In Review Tom Kerkhove Within 30 days 2 760227783 Emerging Issue - GatewayV2 - Unable to connect to service fabric backend due to ... No Backend 2026-03-11
37001452 Introduce TSG how we can disable HTTP/2 from platform side To Do Tom Kerkhove Within 14 days 2 760227783 Emerging Issue - GatewayV2 - Unable to connect to service fabric backend due to ... No Backend 2026-03-11
37069243 Auto-rollback slow HEADs (>25s) To Do Unassigned Within 14 days 2 760018381 Emerging Issue - GatewayV2 - HEAD Calls take too long or timeout No Backend 2026-03-11
37069239 Extend Gateway v2 Auto-Rollback for "Invalid non-ASCII or control character in header" To Do Unassigned Within 14 days 2 760018381 Emerging Issue - GatewayV2 - HEAD Calls take too long or timeout No Backend 2026-03-11
37069228 Auto-rollback for Proxy hostname + internal VNET To Do Unassigned Within 14 days 2 760028728 Emerging Issue - Gatewayv2 - SSL Handhshake Failures on Gateway Endpoint No Backend 2026-03-11
36881961 Gateway V2 CRI: Increased latency for HEAD requests In Review Maxim Kim Within 30 days 2 760018381 Emerging Issue - GatewayV2 - HEAD Calls take too long or timeout No Backend 2026-03-11
35993139 Gatewayv2: Breaking change when backend returns certain characters in header In Review Chun Ye Within 30 days 2 760146538 Emerging Issue - GatewayV2 - Gateway Calls Fail if header contains special chara... No Backend 2026-03-11
37068417 Gateway V2 CRI: Invalid/special characters in inbound request forces Gateway to return 400 (which is not logged) New Chun Ye Within 14 days 2 760146538 Emerging Issue - GatewayV2 - Gateway Calls Fail if header contains special chara... No Backend 2026-03-11
37054623 Database in api-blu-prod-scaleunit-001 keeps growing with entity events New Ondrej Oprala Within 30 days 2 756117435 [Public] eastus [v2] SKUv2 Customer Activation SuccessRate Below 95% SLA (with 3... No SMAPI 2026-03-10
37054597 Fix smapi and dataapi alerts to not include client disconnects To Do Rafal Mielowski Within 14 days 2 758343667 [Public] DataAPI-Australia East DataAPI success rate below 95% for a region in l... No SMAPI 2026-03-10
37028548 Fix null handling in MsiResourceManager.cs:210 New Nina Ren Within 60 days 2 753751475 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Platform(InternalOnly) 2026-03-08
37016965 Add details to Regional RP Deployment TSG to help with hotfixes To Do Brian McAbee Within 14 days 2 752344563 Emerging Issue: CRI:21000000915331 - SKUv2 service upgrade to 0.50 can change VN... No Backend 2026-03-06
37015084 Introduce automated rollout for tenant & custom settings on AOAI services in sovereign clouds New Tom Kerkhove Within 30 days 2 757384697 [Mooncake] | Connect to document intelligence via private endpoint failed Yes Backend 2026-03-06
37014811 Upgrade V2 should not block custom settings only upgrade when quarantined for minor version New Unassigned Within 14 days 2 757384697 [Mooncake] | Connect to document intelligence via private endpoint failed Yes Backend 2026-03-06
36763937 Introduce automated rollout for tenant & custom settings on AOAI services Active Tom Kerkhove Within 14 days 2 757384697 [Mooncake] | Connect to document intelligence via private endpoint failed Yes Backend 2026-03-06
37000136 UpdateTenantProvisioningAfterRestore SQL procedure times out for large services due to unindexed Revision scans and expo... New Unassigned Within 30 days 2 21000000904474 Unable to restore APIM service No Backend 2026-03-05
36991997 Add upgrade input validations for Skuv2 to determine correctness To Do Ajinkya Shendre Within 14 days 2 752344563 Emerging Issue: CRI:21000000915331 - SKUv2 service upgrade to 0.50 can change VN... No Backend 2026-03-04
36992167 Add Upgrade BVTs for Skuv2 Upgrades To Do Ajinkya Shendre Within 14 days 2 752344563 Emerging Issue: CRI:21000000915331 - SKUv2 service upgrade to 0.50 can change VN... No Backend 2026-03-04
36940693 Add upgrade validation to PE+VNet PremiumV2 integration test and use PremiumV2 as frontend APIM To Do Ajinkya Shendre Within 30 days 2 752344563 Emerging Issue: CRI:21000000915331 - SKUv2 service upgrade to 0.50 can change VN... No Backend 2026-03-03
36887594 Fix WebsiteResource.VnetIntegrationSubnetId property to always have correct subnet resource id. New Ajinkya Shendre Within 30 days 2 752344563 Emerging Issue: CRI:21000000915331 - SKUv2 service upgrade to 0.50 can change VN... No Backend 2026-03-03
36956497 Integrate Service Monitor "[Public] More than 4 customer orchestrations failed due to lack" with Brain for Aut... To Do Unassigned Within 14 days 2 753777704 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2026-03-02
36952197 Validate Billing Meters Accuracy and Enable Billing Orchestration in Bleu and Delos To Do Unassigned Within 14 days 2 753931769 [BleuFrance] - API Management - service in Bleu New Region Billing Meter Emissio... No Backend 2026-03-02
36912977 Integrate Service Monitor "b3e7404a-537a-4e1f-931a-9857cac6f299" with Brain for Auto Outage Detection (Service... New Unassigned Within 30 days 2 742424848 Azure API Management Activations Failing Across 8 public regions Yes Backend 2026-02-26
36912976 Integrate Service Monitor "[Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ un" with Brain for Au... New Unassigned Within 14 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2026-02-26
36914724 Fleet Diagnostics alert should not fire when retry succeeded To Do Unassigned Within 14 days 2 753660418 [Public] [VMSS][germanywestcentral] [VMSS][germanywestcentral] Fleet Diagnostic ... No Backend 2026-02-26
36906842 Investigate and fix the global 'Object reference not set' regression causing massive GatewayFailure errors since Feb 14-... To Do Unassigned Within 14 days 2 746587916 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Gateway 2026-02-26
36906822 Review and tune monitor title and BRAIN alerting logic to suppress or isolate noisy single-service issues (e.g., alat-pr... To Do Unassigned Within 14 days 2 746587916 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Gateway 2026-02-26
36896334 Introduce SRE agent to automatically assess why BRAIN declared SLI drop New Unassigned Within 30 days 2 752999523 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-02-25
36896314 Provide guidance how to rollback consumption site extension To Do Unassigned Within 30 days 2 752999523 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-02-25
36896193 Exception stacktrace is not logged for null reference in transfer-response nor available in ApiGatewayInfra New Unassigned Within 14 days 2 752999523 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-02-25
36896187 [Gateway] NRE in TransferHandler due to null CorrelationId header from DistributedTracingClientResponseHandler (0.50 reg... In Progress Nima Kamoosi Within 14 days 2 752999523 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-02-25
36895059 [Gateway] NRE in TransferHandler due to null CorrelationId header from DistributedTracingClientResponseHandler (0.50 reg... In Progress GitHub Copilot Within 14 days 2 752999523 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-02-25
36878388 Make the alert brain aware To Do Shilpa Mani Within 14 days 2 743669045 Endpoint throttler causes unexpected 503s when traffic is expected for customers... Yes Gateway 2026-02-24
36876992 Feature request: Anomaly detection correlation with version change, S500/1P special lower thresholds. New Unassigned Within 30 days 2 743669045 Endpoint throttler causes unexpected 503s when traffic is expected for customers... Yes Gateway 2026-02-24
36863862 Change SKU v2 SLA dashboard to exclude unsupported scenario to avoid misreporting SLA To Do Unassigned Within 14 days 2 752024466 [Public] SMAPI-germanywestcentral SMAPI success rate below 99.95% for v2 scale u... No SMAPI 2026-02-23
36863840 Change SKU v2 SLI to exclude unsupported scenario to avoid misfire In Review Tom Kerkhove Within 14 days 2 752024466 [Public] SMAPI-germanywestcentral SMAPI success rate below 99.95% for v2 scale u... No SMAPI 2026-02-23
36863790 Using Git features in SKU v2 should not return 500 (/tenant/configuration/syncState) New Unassigned Within 14 days 2 752024466 [Public] SMAPI-germanywestcentral SMAPI success rate below 99.95% for v2 scale u... No SMAPI 2026-02-23
36856305 Don't fire Fleet Diagnostic Failure Detected alert for < 5 services In Review Tom Kerkhove Within 14 days 2 751309226 [Public] [VMSS][centralindia] [VMSS][centralindia] Fleet Diagnostic Failure Dete... No ServicingLoop 2026-02-21
36846776 Improve platform health in ASI to show per region for multi-regional deployments To Do Unassigned Within 30 days 2 750329430 [Public] Impacted services (1) - [Gateway] PremiumSku-nonVNET- Gateway 100% not... No Platform(InternalOnly) 2026-02-20
36846753 Scaling operations in ASI are unreliable in multi-region scenario New Tom Kerkhove Within 30 days 2 750329430 [Public] Impacted services (1) - [Gateway] PremiumSku-nonVNET- Gateway 100% not... No Platform(InternalOnly) 2026-02-20
36830450 Provide detector to surface backend certificate information to customers to surface expired certs New Tom Kerkhove Within 60 days 2 21000000908139 Unable to reach endpoint hosted in azure aks from API management service No Backend 2026-02-19
36829466 Gateway V2 CRI: Memory leak kept crashing gateway New Dean Ward Within 30 days 2 747557452 [Public] australiaeast - SkuV1 - Premium - Impacted services (1) - Threshold (1)... No Gateway 2026-02-19
36770330 RCA EndpointThrottler Hitting the High Limit In Progress Mahsa Sadi Within 30 days 2 743669045 Endpoint throttler causes unexpected 503s when traffic is expected for customers... Yes Gateway 2026-02-18
36819323 Auto detect dependent outages - for livesite efficiency To Do Unassigned Within 14 days 2 740488024 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2026-02-18
36819318 Terminate long running Activations to detect failures early To Do Chukwuemeka Ojih Within 30 days 2 740488024 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2026-02-18
36804166 [Alerts/Monitoring] Detect long running or stuck Activations To Do Unassigned Within 14 days 2 740488024 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2026-02-17
36800684 Do not auto-rollback Gateway v2 for services using HTTP/2 to backend and/or gRPC To Do Unassigned Within 14 days 2 748837199 APIM Scheduled Maintenance UK South Caused Service Outage for Dragon Copilot UK. Yes Backend 2026-02-17
36800676 Exclude Dragon Copilot from auto-rollback alert To Do Unassigned Within 14 days 2 748837199 APIM Scheduled Maintenance UK South Caused Service Outage for Dragon Copilot UK. Yes Backend 2026-02-17
36797021 GC should back off on seeing throttling failures on any resource deletion To Do Unassigned Within 30 days 2 740488024 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2026-02-16
36797019 Storage account creation should have timeout for detecting Activation failures early To Do Unassigned Within 30 days 2 740488024 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2026-02-16
36785830 JIT Policy for Key Vault Should Have Auto-Approve for Primary and Backup Backend DRI + IM To Do Raluca Constantina Popescu Within 14 days 2 748200494 [Publisher-Prod] Azure Key Vault--Key Vault Certificate Failed to Renew for: eus... No Backend 2026-02-15
36775804 Auto-rollback Gateway v2 for ProcessDyingTooFrequently In Review Tom Kerkhove Within 14 days 2 747590388 [Public] westus - SkuV1 - Premium - Impacted services (1) - Threshold (1) - [v2... No Gateway 2026-02-14
36765957 Do Not Throw "Free Certificate HTTP-Token cannot be retrieved by internal API" When Validation is Successful New Nina Ren Within 14 days 2 51000000899410 managed certificate expired in my APIM No Backend 2026-02-13
36617390 Investigate use of external cache for token exchange New Unassigned Within 60 days 2 736440943 [Public] APIHub [Kusto] ResourceProvider High Response Time - power-rp-europe002... No ServicingLoop 2026-02-12
36758388 Review Power Platform Redis Caches and scale as needed In Progress Michael Rowden Within 14 days 2 736440943 [Public] APIHub [Kusto] ResourceProvider High Response Time - power-rp-europe002... No ServicingLoop 2026-02-12
36719578 VM SKU fallback should retry only if the failure is due to capacity reasons To Do Unassigned Within 60 days 2 743072555 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-02-10
36683492 Allow increasing memory limit for cache beyond max integer To Do Unassigned Within 30 days 2 743593662 [Public] westus2 - SkuV1 - Premium - Impacted services (1) - Threshold (1) - [v... No Backend 2026-02-06
36682473 Time to detect high, need improvement in Detection New Unassigned Within 14 days 2 743669045 Endpoint throttler causes unexpected 503s when traffic is expected for customers... Yes Gateway 2026-02-06
36678777 Properly handle invalid integer when reading from settings To Do Unassigned Within 30 days 2 743593662 [Public] westus2 - SkuV1 - Premium - Impacted services (1) - Threshold (1) - [v... No Backend 2026-02-06
36675398 Logging improvements for efficient RCAs in VM Sku fallback To Do Shubham Sharma (DevDiv) Within 14 days 2 743072555 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-02-06
36675395 Fix RP Activation SLA dashboard to exclude noise from Invalid input failures To Do Shilpa Mani Within 14 days 2 743072555 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-02-06
36675394 Fix Activation failures LSI monitor to avoid noise To Do Shilpa Mani Within 14 days 2 743072555 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2026-02-06
36654248 Activation failing in FF and MC after G2 Issuer launch New Samir Solanki Within 14 days 2 743290895 [Fairfax] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (with 6+ occu... No Backend 2026-02-05
36629594 Get rid of g.portal-editor.azure-api.net endpoint To Do Roman Kolesnikov (APIM) Within 30 days 2 742547046 [Public] [Stage_7] [v2] [SKUv2] Developer Portal Editor endpoint SLA below 97% f... No Backend 2026-02-03
36628151 Re-enable in East Asia To Do Tom Kerkhove Within 30 days 2 742528968 [Public] Fleet Diagnostic Failure Detected in East Asia No Backend 2026-02-03
36627637 Properly handle invalid requests: /upload endpoint shouldn't fail with 500 when invalid input is sent New Roman Kolesnikov (APIM) Within 30 days 2 742547455 [Public] West Europe [v2] [SKUv2] Developer Portal Editor regional app SLA below... No Backend 2026-02-03
36626223 Run Fleet Diagnostics alert hourly, not daily In Review Tom Kerkhove Within 14 days 2 742528968 [Public] Fleet Diagnostic Failure Detected in East Asia No Backend 2026-02-03
36626217 Regionalize Fleet Diagnostics alert To Do Tom Kerkhove Within 14 days 2 742528968 [Public] Fleet Diagnostic Failure Detected in East Asia No Backend 2026-02-03
35599221 Gateway V2 CRI: SymmetricAlgorithm.Create() is not supported in .NET 8 In Review Tom Kerkhove Within 30 days 2 51000000879801 Disable GatewayV2 No Backend 2026-02-02
36610543 RP Regional Cluster database at more than {thresholdCpu}% for more than {thresholdCount} datapoints in past 6 hour - {Cl... New Unassigned Within 30 days 2 735213317 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... No Backend 2026-02-02
36610524 Update alert to fire only for customer orchestrations: More than 4 customer orchestrations failed due to lack of AZ supp... Active Srajan Agrawal Within 30 days 2 735213317 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... No Backend 2026-02-02
36569400 Introduce Sev2.5 alert to identify MI certificate refresh issues, with less than 7 days before expiry New Unassigned Within 30 days 2 737429942 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Backend 2026-01-28
36567234 Create new alert in capacity queue for quota failures To Do Neha Gupta Within 14 days 2 735146216 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Capacity-Internal 2026-01-28
36567219 Data logger for Resource Pool table should reflect granular changes in RP table To Do Unassigned Within 30 days 2 735146216 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Capacity-Internal 2026-01-28
36567216 Capacity alerts should calculate buffer considering the runner activation throughputs To Do Unassigned Within 30 days 2 735146216 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Capacity-Internal 2026-01-28
36566642 EventHub send timeout should apply to single publisher.SendAsync call instead of multiple New Ansul Goenka Within 14 days 2 739261853 Nearly all billing events being lost in transit between APIM and cogsvc billing ... No Backend 2026-01-28
36552787 RCM VM SKU restrictions job should run at smaller intervals To Do Unassigned Within 30 days 2 733408111 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Platform(InternalOnly) 2026-01-27
36552777 SRE Agent: monitor activation failures and auto-mitigate when over To Do Shubham Sharma (DevDiv) Within 14 days 2 733408111 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Platform(InternalOnly) 2026-01-27
36544906 Tenant SSL expiration alert missing New Gleb Feoktistov Within 14 days 2 739033506 Sophia EUS Containers are not reachable due to some cert error No Backend 2026-01-26
36528976 Fix alerts for SMAPI to exclude HttpIncomingRequest event from calculation New Rafal Mielowski Within 14 days 2 738445134 [Public] SMAPI-eastus2euap SMAPI success rate below 99.95% for v2 scale unit in ... No Backend 2026-01-24
36525996 Retrospective 36493140 - Examine logic LSI alert thresholds To Do Michael Rowden Within 14 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-23
36504493 Add TSG how to query MI logs To Do Unassigned Within 30 days 2 737429942 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Backend 2026-01-22
36504476 Refresh MI ACIS should fail, if the update of certificates failed New Unassigned Within 30 days 2 737429942 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Backend 2026-01-22
36503803 Provide ACIS to update secret URL for MI cert (admin only) New Unassigned Within 30 days 2 737429942 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Backend 2026-01-22
36503137 Failing to renew a single MI certificate should not block all cert renewals New Unassigned Within 30 days 2 737429942 [Public] Gateway is unable to acquire tokens for MI due to expired certificate (... No Backend 2026-01-22
36500777 Index out of bound when decoding SAS token in data api New Rafal Mielowski Within 30 days 2 737382017 [Public] DataAPI-prodm2wbapim.azure-api.net DataAPI success rate below 98% for s... No Backend 2026-01-22
36500771 Data api dashboards for issues detection New Rafal Mielowski Within 14 days 2 737382017 [Public] DataAPI-prodm2wbapim.azure-api.net DataAPI success rate below 98% for s... No Backend 2026-01-22
36493140 Retrospective 1327736 - Add LSI alerts to detect Redis capacity issues and Redis timeouts To Do Bruce Moe Within 14 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-21
36486123 Close Dav4 resource pools that have VM SKU restrictions To Do Neha Gupta Within 14 days 2 736619322 [MoonCake] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (with 6+ occ... No Backend 2026-01-21
36482976 node js vulnerability in node v 22.16 New Roman Kolesnikov Within 30 days 2 731700059 [Publisher-Prod] SHAIntel--Immediate Action Required — Remediate Past-SLA Vulner... No Backend 2026-01-21
36476900 Retrospective 1327736 - Introduce layered caching and circuit breakers around Redis calls in ApiHub to reduce direct dep... To Do Bruce Moe Within 60 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-20
36476876 Retrospective 1327736 - Update compute/storage failover TSGs To Do Bruce Moe Within 14 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-20
36476855 Retrospective 1327736 - Implement automatic compute failover thresholds adjustment for ApiHub traffic manager to enable ... To Do Bruce Moe Within 14 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-20
36476788 Permanently scale Redis instance in power-europe002 to P5 SKU To Do Bruce Moe Within 14 days 2 731380101 APIHub [HealthMonitor] TokenExchange High Response Time for power-te-europe002-w... No ServicingLoop 2026-01-20
36473840 Dashboard for Investigating Regional SLI Success Rate Drops for Gateway New Unassigned Within 14 days 2 731443403 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2026-01-20
36463031 Introduce alert to detect certificate installation failures New Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-01-19
36451702 SRE Agent- Force Upgrade for Authentication Resource Not Assigned Failures New Chukwuemeka Ojih Within 30 days 2 734421301 [Public] [Aggregate] Dev SKU services that are down for more than 1 hour No Backend 2026-01-16
36440554 Bootstrapper should block VM startup if Geneva Ingestion endpoint is unreachable New Chukwuemeka Ojih Within 30 days 2 21000000812954 2512110040004303 | Surge in 401 requests seems to be related to a faulty machine No Backend 2026-01-15
36434455 FailedToAcquireMsalToken due to expired certificate should be logged as error / gateway error Active Tom Kerkhove Within 30 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-15
36434368 Use SRE Agent to fully automate handling of expired certificate, force refresh and monitor mitigation New Tom Kerkhove Within 30 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-15
36434288 Update RefreshMsiCredentialGettingNewCredentialsSkipped log entry to include NotBefore To Do Maxim Agapov Within 14 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-15
36433723 Provide dedicated columns for NotBefore & NotAfter of auth certificate in GatewayEntraIdTokens for ease of use New Tom Kerkhove Within 30 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-15
36433709 Introduce Sev2 alert to identify regressions in MI token acquisition after rolling out a new version New Tom Kerkhove Within 60 days 2 734053868 [EastUS2EUAP] MsiTokenCredentialProviderErrorObtainingToken error in  0.50.25943... No Backend 2026-01-15
36433693 Acquiring MI token causes a null reference New Tom Kerkhove Within 14 days 2 734053868 [EastUS2EUAP] MsiTokenCredentialProviderErrorObtainingToken error in  0.50.25943... No Backend 2026-01-15
36431386 Refresh MSI Orchestration not renewing the new Credentials generated for Inverted Umbrella In Review Maxim Agapov Within 14 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-15
36430015 Introduce Sev2 alert for data-plane issues due to expired MI certificate New Tom Kerkhove Within 30 days 2 734080311 ApiManagement Service with MSI configured may see their Runtime calls fail with ... No Backend 2026-01-14
25298069 FailedToProcessRequest: DotNetty.Codecs.EncoderException - InvalidOperationException: unexpected message type: DefaultFu... New Dean Ward Within 30 days 2 733476504 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2026-01-14
36374966 [Docs] Update documentation to have clear and up-to-date steps to stop the ongoing release To Do Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-01-09
36374870 [Follow-up] Research possibility of utilizing SLB logs to detect an increased number of connection attempts To Do Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-01-09
36374848 [Follow-up] Research onboarding on BRAIN detection for traffic volume changes To Do Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-01-09
36374834 [Bootstrapper] Log all SSL certificates chain validation results Committed Gleb Feoktistov Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2026-01-09
36358999 Gateway V2 CRI: Ensure we have request queue metric in Gateway v2 New Dean Ward Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
31050643 Gateway V2: Investigate if dropped requests due to full queue are logged in ProxyRequest New Dean Ward Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
35980655 Log outbound request queue time in GatewayOutgoingRequests New Dean Ward Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
36358804 Gateway V2: GatewayOutgoingRequests connectionId is not populated New Tom Kerkhove Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
36358805 Gateway V2: GatewayConnectionStats TcpConnectionEstablished event was logged twice New Tom Kerkhove Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
36358807 Gateway V2: Queue time is not logged in GatewayOutgoingRequest New Dean Ward Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
36358809 Gateway V2: Outbound connection limit is hardcoded to 1024 New Dean Ward Within 30 days 2 51000000825069 APIM capacity reaching 99% No Backend 2026-01-08
36357942 Extend get-authorization-context TSG to help identify if it's a customer or platform issue New Unassigned Within 60 days 2 731443403 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2026-01-08
36356578 Set up alert to detect "Specified padding mode is not valid" failures New Unassigned Within 14 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-08
36356468 Introduce TSG with steps to gather info for certificate padding issue To Do Tom Kerkhove Within 14 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-08
36356241 Improve error handling for token acquisition (authentication-managed-identity and others) New Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-08
36347095 RequestToBackendFailed does not log the effective exception New Unassigned Within 14 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-07
36189472 Ensure we log MSAL exceptions from Managed Identity policy in ProxyRequest In Review Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-07
36344342 Introduce admin-level ACIS to acquire MI certificate for tenant New Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-07
36320485 RequestToBackendFailed events are not logged in context of a request In Review Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2026-01-05
36295916 [GWv2] Create an alert for client TLS handshake issues To Do Unassigned Within 30 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2025-12-30
36295909 [Test] Add custom hostnames integration tests for basic scenario To Do Unassigned Within 30 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2025-12-30
36264589 [RP] GetAzureRestMessageExternalError to handle this type of errors so customers are aware of the invalid input To Do Unassigned Within 30 days 2 725935306 [Public] eastus2 [v2] SKUv2 Customer Activation SuccessRate Below 95% SLA (with ... No Backend 2025-12-22
36264524 [Bootstrapper] Ensure intermediate and root certificates from older hostname configurations are not removed if still in ... New Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2025-12-22
36264303 [GWv2] Research a possibility of logging connectivity issues during TLS handshake To Do Unassigned Within 30 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2025-12-22
36255935 Author BRAIN SLI to detect APIM outage through BRAIN. This ICM was manually created and very late. TTO SLA should be <... New Unassigned Within 14 days 2 724861901 APIM Dev SKU Custom Domain SSL certificate installation failures 0.50 Release Yes Backend 2025-12-22
36173724 Env. vars for SQL datasets broken, not replacing dataset after renaming stored procedure entity New Jesus Lopez Felix 3 2 21000000810946 [CRI] [Premier] Environment variable database connection errors in Power Apps No Backend 2025-12-19
36245084 Extend GetAuthTokenCompleted with certificate used information New Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2025-12-19
36244757 Extend (MI) certificate auth logging to include extensions on the certificate In Review Tom Kerkhove Within 30 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2025-12-19
36243192 [Self-heal] Rebooting VMSS when proxy is not starting up due to resource exhaustion To Do Chukwuemeka Ojih Within 60 days 2 51000000784593 apim est down No ServicingLoop 2025-12-19
36240019 dashboard(dev-portal) Add dashboard showing affected services and aggregate errors by Dev Portal activation failures To Do Unassigned Within 30 days 2 720988238 [Public] Japan East [v2] [SKUv2] Developer Portal: Activation Failure Rate above... No Backend 2025-12-18
36239877 fix(gw) Update CPU capacity metric to reflect Total CPU percentage more closely New Ansul Goenka Within 30 days 2 722468996 [Mooncake][21V] - [APIM][Azul_SR_20251214387055][No MSSolve Case][Customer‘s API... No Backend 2025-12-18
36236060 Auto-open pool exhaustion LSI as Sev2 when below 20%, Sev3 for 40% (% based on region & pool size) To Do Dan Chartier Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-12-18
36236057 Change Dashboards / alerts to use OchestrationKpi New Unassigned Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-12-18
36236049 Increase pool size for SKU v2 to 50 To Do Dan Chartier Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-12-18
36235876 fix(gw) Fix the high CPU state caused by previous NRE fix New Unassigned Within 30 days 2 723423460 PARENT [prod] API RP Alert: HealthMonitorRegionalResourceProviderNotReachable oc... No Backend 2025-12-18
36227766 Investigate why CNAME updates are needed for APIM service termination To Do Ajinkya Shendre Within 30 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-17
36227713 Add a global Activation/Update/Terminate SLA dashboard for SkuV1/Consumption Sku To Do Ajinkya Shendre Within 30 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-17
36227460 Introduce GatewayOutgoingRequests for token acquisition New Tom Kerkhove Within 60 days 2 721806070 [GitHub] Elevated 500s in Australia East Deployments No Gateway 2025-12-17
14578048 [RP] Tenant Certificate Unexpectedly Removed from Machine New Gleb Feoktistov Within 60 days 2 21000000819283 2512170030002686 | Application Gateway failing to connect to APIM after upgrade ... No Backend 2025-12-17
36221474 Inspect types used in typeof expression for available types New Rafal Mielowski Within 60 days 2 31000000513411 [MSRC] [104429] - RemoteCodeExecution - API Management Services - Deserialisatio... No Backend 2025-12-17
36221468 Restrict all Transform methods from XslCompiledTransform New Rafal Mielowski Within 14 days 2 31000000513411 [MSRC] [104429] - RemoteCodeExecution - API Management Services - Deserialisatio... No Backend 2025-12-17
31677449 Stop tracking Azure KeyVault in Resource Pool New Kriti Majumdar Within 60 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-17
36195155 Gateway should skip duplicate backends when being added to a pool New Mahsa Sadi Within 30 days 2 721423383 [Public] westus2 (1) - [Gateway]: Event processing SLA below 99.5% with over 4 ... No Gateway 2025-12-15
36189864 alert(platform) Update alert threshold (from 6) so it fires under conditions of major synthetic failures New Unassigned Within 30 days 2 722324504 [Fairfax] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (with 6+ occu... No Backend 2025-12-14
36189433 fix(gw) Disable Cert Padding Check RegKey config setting New Nima Kamoosi Within 30 days 2 21000000813688 "Specified padding mode is not valid for this algorithm" errors observ... No Backend 2025-12-13
36178734 Backend pools with same backend referenced multiple times should be blocked New Mahsa Sadi Within 30 days 2 721423383 [Public] westus2 (1) - [Gateway]: Event processing SLA below 99.5% with over 4 ... No Gateway 2025-12-12
36169915 Identify services which have compute not-aligned with their Sku and bring them in sync New Shilpa Mani Within 30 days 2 715853981 [Public] centralus - SkuV1 - Premium - Impacted services (1) - Threshold (1) - ... No Backend 2025-12-11
36169402 BRAIN SkuV2 Activation Success Rate - move to Sev2 with auto comms To Do Branimir Giurov Within 30 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-11
36168160 Finetune Fairfax alerts so that they fire when SLA is consistently at 0. To Do Shilpa Mani Within 14 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-11
36168142 Alerts for api-bn1-prod-01 capture metrics from both Prod East US 2 and Fairfax USGov Virginia New Shilpa Mani Within 14 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-11
36162548 Add ability to change daily SLA dashboard to support hourly SLA To Do Ajinkya Shendre Within 30 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-11
36149539 Close resource pools in shared subscriptions in Fairfax Committed Shubham Sharma (DevDiv) Within 14 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-10
36149533 Release RCM ACIS action in N-clouds To Do Unassigned Within 14 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-10
36149524 Skip creating compute resource pool in region if KeyVault is not available New Unassigned Within 14 days 2 720410333 [api-bn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-12-10
36103335 [SMAPI] Oauth2 clientSecret, resourceOwnerUsername and resourceOwnerPassword is exposed when exporting API In Review Unassigned Within 60 days 2 31000000507849 [MSRC] [104150] - Azure - API Management Leaks Configured OAuth2 and OpenID Cred... No Backend 2025-12-05
36092610 Investigate BRAIN healthchecks and how to integrate them in APIM BRAIN alerts To Do Branimir Giurov Within 14 days 2 704152427 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2025-12-04
36091919 "Unable to connect to the remote server" message when logging GatewayFailure internal event should be logged a... To Do Nima Kamoosi Within 30 days 2 704152427 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2025-12-04
36091841 Dashboard: BRAIN Gateway SuccessRate impact (regional) To Do Branimir Giurov Within 14 days 2 704152427 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2025-12-04
36091833 TSG: BRAIN Gateway SuccessRate drop - add section explaining that the impact is for both all Skus including Consumption ... To Do Branimir Giurov Within 14 days 2 704152427 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... Yes Backend 2025-12-04
36068047 Repair Item: Clean up leaked DNS records for SkuV2 PrePro services which do not exist To Do Unassigned Within 30 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-03
36068344 Use a different DNS zone for Customer APIMs vs SKUv2 Prepro APIMs To Do Unassigned Within 60 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-03
36068272 Regionalize the DNS Zones so that each zone contains records for only 1 region To Do Unassigned Within 60 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-03
36068532 Add a dashboard to add visibility for current number of DNS records per zone. To Do Unassigned Within 14 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-03
36068472 Add alerting for DNS Zone records limit for all the zones To Do Unassigned Within 14 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-03
36068760 Scale Unit Health Monitoring New Ajinkya Shendre Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-12-02
36067843 Add Outage criteria for wide-spread activation failures To Do Unassigned Within 14 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-12-02
36058618 SKUv2 preprovisioned service CNAME is not cleaned up on activation. To Do Dan Chartier Within 14 days 2 716137768 [Public] [v2] SKUv2 Update SuccessRate Below 95% SLA (with 3+ unique service/sub... Yes Backend 2025-12-01
36056499 Publish customer-facing guidance how to front NSP-secured resources with APIM Active Tom Kerkhove Within 30 days 2 21000000792762 "error": "Failed to upload file. Status: 403" | Case: 251119... No Backend 2025-12-01
36056490 Verify Network Security Perimeter can protect resource that APIM can connect to New Unassigned Within 14 days 2 21000000792762 "error": "Failed to upload file. Status: 403" | Case: 251119... No Backend 2025-12-01
36055992 Support for internal detectors in Bleu New Unassigned Within 60 days 2 713696610 [Bleu] Error when deploying APIM instance to Bleu No Backend 2025-12-01
36018288 Fix IndexOutOfRange Exception in Weighted Distributor In Progress Mahsa Sadi Within 30 days 2 710376775 [Public] eastus-SkuV1 (Er:12 x Svc:11) - [v2][Gateway]: 2+ new Gateway Requests... No Backend 2025-11-27
36028267 Improve Brain Coverage by integrating Service Monitor: [Public] SKUv2 Customer Activation SuccessRate Below 95% SLA (Se... To Do Shilpa Mani 1 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-26
36023658 Provide TSG with common SKU v2 provisioning problems and include "user not authorized" scenario To Do Tom Kerkhove Within 14 days 2 21000000793570 APIM Service Activation Issue due to lack of permissions to join subnet  No Backend 2025-11-26
36023655 Ensure customer has Microsoft.Web enabled on subscription during SKU v2 + VNET scenarios Active Ajinkya Shendre Within 30 days 2 21000000793570 APIM Service Activation Issue due to lack of permissions to join subnet  No Backend 2025-11-26
36023284 Document DNS endpoints for Bleu in network reference New Unassigned Within 14 days 2 713696610 [Bleu] Error when deploying APIM instance to Bleu No Backend 2025-11-26
35871847 Sync up with SQL Team and figure out the action item to whitelist our subscriptions to allow AZ By Default In Progress Saikiran Vukyam Within 30 days 2 707587308 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2025-11-24
35871844 SQL AZ Fallback Logic in the Activation Path In Review Srajan Agrawal Within 14 days 2 707587308 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2025-11-24
36001920 Move SKU v2 Sev3 LSIs from Backend to Platform loop To Do Dan Chartier Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-24
36001802 Provide TSG how to assess state of SKU v2 pre-pooling To Do Dan Chartier Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-24
35996657 No ability to query deployment history after activation due to RG cleaned up New Shilpa Mani Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-23
35996656 Activation did not surface 429s from Antares when trying to perform slot operation New Shilpa Mani Within 30 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-23
35996546 Extend CreateOrUpdateResourceGroup* & WaitForTemplateDeployment* logging to include subscription ID In Review Tom Kerkhove Within 14 days 2 714425149 SKUv2 Customer Activation SuccessRate Below 95% SLA for non-pre-pooled services ... Yes Backend 2025-11-23
35987143 Logs are gone in Bleu cloud for regional RPs New Brian McAbee Within 14 days 2 713696610 [Bleu] Error when deploying APIM instance to Bleu No Backend 2025-11-21
35974349 Introduce BVT scenario for connection closure validation Active Dean Ward Within 60 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-20
35974347 Promote 503 LSI to Sev2, instead of email, and consider fleet-wide rather than 1P Active Tom Kerkhove Within 30 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-20
35974303 Gateway does not properly send connection closure requests Active Tom Kerkhove Within 14 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-20
35950140 Introduce TSG with queries to identify requests were drained properly To Do Unassigned Within 60 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-18
35950127 Introduce TcpConnectionClosed logs in GatewayConnectionStats, if possible New Unassigned Within 30 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-18
35942569 AOAI Service Specific Status Code Requirements (Service Health Repair) New Unassigned Within 14 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-11-17
35926669 Create a Comprehensive Regional Outage Dashboard covering all the Components and SKUs SLA in ASI To Do Unassigned Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-15
35925202 Add Multi-Tenant Consumption Sku dashboard to Regional Outage Dashboard To Do Unassigned Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-14
35925211 Add Scale units SLA to Regional Outage Dashboard To Do Ajinkya Shendre Within 14 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-14
35920317 Introduce Min/Max aggregation for active request metric New Unassigned Within 30 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-14
35910574 Explore if SRE Agent can be utilized for Regional Outage Scenarios To Do Unassigned Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909680 Onboard brain alerts for Premium Sku service with internal vnet To Do Branimir Giurov Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909631 Fix brain alert for Premium sku Proxy unreachable To Do Branimir Giurov Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909542 Build dashboards which can be used for determining regional outage impact To Do Ajinkya Shendre Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909812 Enable brain auto comms for outage scenarios To Do Branimir Giurov Within 30 days 2 690759450 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909787 Review sql recommendations from most utilized regions/scaleunit for smapi db New Rafal Mielowski Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909751 SRE agent integration for checking dependencies of our infra New Macko Treder Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35909334 Move regional beta features for RP out of DB New Unassigned Within 60 days 2 698451492 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35908607 Review MCS cookie usage and disable / isolate if possible New Unassigned 3 2 21000000754738 [Strategic] All Users are getting an error message every time when interacting w... No ServicingLoop 2025-11-13
35907902 Introduce column in ProxyRequest indicating connection closure was requested New Tom Kerkhove Within 30 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-13
35907900 Enable active request counter metric by default In Review Tom Kerkhove Within 14 days 2 707289075 [Missing Info] [GitHub Copilot]: error rates above SLO in multiple Azure regions... No Backend 2025-11-13
35906643 Improve process of enabling beta feature to include RP version check To Do Unassigned Within 60 days 2 698451492 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-13
35884968 MonAgent: Fix Azure Profiler snapshotting In Review Dean Ward Within 30 days 2 707998003 AOAI Copilot services have increased gateway overhead after enabling Gateway v2 No Backend 2025-11-12
35889999 Gateway V2 CRI: Handle IOException caused by client cancellation In Review Dean Ward Within 14 days 2 707998003 AOAI Copilot services have increased gateway overhead after enabling Gateway v2 No Backend 2025-11-12
35868374 AOAI - Invoke-Request Policy Does Not transfer Private Link Info In Review Mahsa Sadi Within 14 days 2 686071385 [Databricks] private endpoint access to Azure AI Foundry getting denied with 403... No Backend 2025-11-11
35880145 AOAI - Invoke-Request Does not Transfer LifeTimeScope.TraceUploader In Review Unassigned Within 14 days 2 686071385 [Databricks] private endpoint access to Azure AI Foundry getting denied with 403... No Backend 2025-11-11
3425163 Detect changes in MSI token failures scoped to particular category New Harish Rane 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-11
35864723 Fail fast and add cx facing error message when attempting to add internal vnet for standardv2 sku. To Do Ajinkya Shendre Within 30 days 2 51000000772946 VNet integration failure - 2511080040001550 - Standard V2 Service Sku No ServicingLoop 2025-11-10
3424180 Create repair item on Power Platform/API hub to adopt PP SDK. To Do Harish Rane 1 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-10
3424137 Detect changes in MSI token failures scoped to particular category New Harish Rane 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-10
35858129 Improve event processing SLA alert to not open on transient processing issues To Do Nima Kamoosi Within 30 days 2 708731316 [Public] eastus2euap (4) - [Gateway]: Event processing SLA below 99.5% with ove... No Backend 2025-11-10
35858115 Tunnel Proxy is not able to serve connection requests due to "The requested address is not valid in its context."... New Maxim Kim Within 30 days 2 708731316 [Public] eastus2euap (4) - [Gateway]: Event processing SLA below 99.5% with ove... No Backend 2025-11-10
35857971 Extend "Gateway event processing failure" TSG to identify if issue is persistent or transient To Do Tom Kerkhove Within 14 days 2 708731316 [Public] eastus2euap (4) - [Gateway]: Event processing SLA below 99.5% with ove... No Backend 2025-11-10
35857819 Tunnel proxy does not support correlating logs New Maxim Kim Within 30 days 2 708731316 [Public] eastus2euap (4) - [Gateway]: Event processing SLA below 99.5% with ove... No Backend 2025-11-10
35843609 The delay of 3h 26m has been observed for outage declaration from the Impact start time To Do Samir Solanki Within 30 days 2 707587308 [Public] More than 4 customer orchestrations failed due to lack of AZ support fo... Yes Backend 2025-11-07
35841058 [ASD] Detect ICANN Verification Pending for close to 15 days To Do Linlu Liu Within 14 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-11-07
35831483 Investigate and mitigate other regions and scale units for database overloads New Rafal Mielowski Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831478 Investigate possibility of getting metrics for database, vms etc directly from geneva for alerting and investigations New Unassigned Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831475 Investigate usage of application insights and azure monitor to pipe metrics from production resources to our geneva space New Unassigned Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831466 Investigate possibility of alert on smapi VM utilization in scaleunit New Unassigned Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831453 Introduce short lived response caching in SMAPI New Rafal Mielowski Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831446 Introduce control plane request throttling on client request from RP in SMAPI New Rafal Mielowski Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831439 Improve performance of checking provisioningState field value from OperationResults table in SMAPI New Unassigned Within 60 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831435 Add SMAPI database autoscale on utilization alert In Review Rafal Mielowski Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831426 Add SMAPI database utilization alerts New Unassigned Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831413 Improve SMAPI SLA alert to include duration as factor of SLA calculation In Progress Rafal Mielowski Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831410 Add new SMAPI SLA alert which would use Antares logs for SLA calculation In Progress Rafal Mielowski Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35831325 Add DTU utilization for smapi database to dashboards To Do Rafal Mielowski Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-11-06
35814965 Update policy forms to automatically create named values for secret values New Javier Borrego Within 30 days 2 31000000457979 [MSRC] [101914] - ElevationOfPrivilege - EOP to get full access to APIM APIs wit... No Backend 2025-11-05
35810061 Ensure Kestrel properly queues requests when getting overwhelmed New Dean Ward Within 30 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
35807139 Do not connect to Redis nodes nor notify others about presence in neighborhood until gateway started successfully New Unassigned Within 30 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
35807063 Introduce alert to identify 503 increase after service upgrade New Unassigned Within 30 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
35805781 Introduce service & cache summary (ie type) in Redis investigation dashboard To Do Unassigned Within 14 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
35805778 Introduce connection lifecycle in Redis investigation dashboard To Do Unassigned Within 14 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
35805773 Introduce role instance filter in Redis investigation dashboard In Progress Dean Ward Within 14 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-04
3418714 End to end test fails when discovers contract change on existing api-version New Harish Rane 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-04
3418559 Formalize the plan in delivering ESTS-R to Global/Geo Azure resources with newer api-version on MSI dataplane To Do Praveen Erode Murugesan 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-04
3418556 Review current Partner facing DevEX on using ESTS-R, turning off instance discovery New Divyansh Manchanda 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-04
35791780 Return the correct new runtime DNS for new environment in swagger on host property To Do Unassigned 2 2 21000000754738 [Strategic] All Users are getting an error message every time when interacting w... No ServicingLoop 2025-11-03
3418038 Formalize the plan in delivering ESTS-R to Global/Geo Azure resources with newer api-version on MSI dataplane To Do Harish Rane 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-03
3418018 Review current Partner facing DevEX on using ESTS-R, turning off instance discovery New Divyansh Manchanda 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-03
35790993 Internal IPs are null for internal VNET service when upgrade goes through upgrade via binaries In Review Omar Adalid Macias Mayorquin Within 60 days 2 21000000757625 Internal VNET APIM is missing the private IPs No Backend 2025-11-03
3417970 Share sample code on ARMWiki on how to use ClientCertificateCredential for Azure RPs to follow that uses ESTS-R, custom ... New Harish Rane 1 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-03
3417969 End to end test fails when discovers contract change on existing api-version New Harish Rane 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-11-03
35789157 Kudu does not properly stop website on version upgrade In Review Tom Kerkhove Within 14 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-03
35788530 Introduce additional Gateway v2 enablement RP integration test with version upgrade New Tom Kerkhove Within 30 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-11-03
35772450 [ASD] Add an alert for 422 errors and Parked domains for early detection To Do Linlu Liu Within 14 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-31
35772405 [ASD] Surface GoDaddy GET/DELETE changeOfRegistrant API To Do Linlu Liu Within 60 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-31
32956625 Create a banner or Alert on the portal once the domain is in Parked State with instructions on how to verify the domain ... To Do Linlu Liu Within 60 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-31
35496255 [Service Health Repair Item] Use anomaly detection for timeouts across a large number of customers to declare outage. New Ankur Goyal Within 60 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35622581 [Repair Item] APIM: Use "external-then-internal" for cache type in APIM Active Das Partha P. Within 30 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35622589 [Repair Item] APIM: Migrate remaining AOAI regions to external redis cache Committed Das Partha P. Within 14 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35734707 [Repair item] Build isolation between 1P and 3P APIM stamps separating their Redis cache Committed Das Partha P. Within 14 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35750267 [Repair Item] Alert based on request processing time in APIM New Bharadwaj Kura Within 30 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35746794 Gateway: Add support for opting out of cache notifications New Dean Ward Within 30 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35746834 Gateway: Address read / write high timeout issues with Redis New Dean Ward Within 30 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
33530925 Caching: Add support for "stale" cache entries with proactive refresh and stampede protection Active Dean Ward Within 60 days 2 695755987 [CSS] -AOAI-Reliability [SAP] [Lantency anomaly boost, number of anomalies 408] Yes Backend 2025-10-30
35757295 Update Build-Out Process: Add billing validation steps before opening new regions. To Do Neha Gupta Within 30 days 2 698012657 [Usage Record Errors] PAUsageRecordErrors [ApiManagement] is unhealthy. No Backend 2025-10-30
35757285 Emit Only Valid Billing Records: Discard old billing records when onboarding new meters to avoid noisy alerts. To Do Shilpa Mani Within 30 days 2 698012657 [Usage Record Errors] PAUsageRecordErrors [ApiManagement] is unhealthy. No Backend 2025-10-30
35748185 [VMSS] "Staleness" of the VM extension state blocking VMSS deployments New Unassigned Within 30 days 2 702923645 APIM is have network issue No ServicingLoop 2025-10-30
33961714 Update MCS connector invoke logic based on PowerFx change New Ivy Ling 3 2 21000000754738 [Strategic] All Users are getting an error message every time when interacting w... No ServicingLoop 2025-10-29
35741251 Introduce large-scale test scenario to simulate AOAI-like Redis setup New Dean Ward Within 60 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-10-29
35737303 Provide information around used Redis in ASI & Redis Dashboard New Tom Kerkhove Within 30 days 2 703227775 Elevated 503s for M365 Copilot in Australia East and UKSouth No Backend 2025-10-29
35735443 Introduce BVT to ensure SubscriptionKeyNotFound is reported correctly New Tom Kerkhove Within 30 days 2 696484578 STRATEGIC CUSTOMER | 2510070040007321 | Request to exclude the APIM instances fr... No ServicingLoop 2025-10-29
35726501 [ASD] Add a substatus for PARKED_VERIFICATION_ICANN To Do Linlu Liu Within 60 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-28
35726476 [ASD] Update internal TSGs to unblock parked domain To Do Linlu Liu Within 14 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-28
35724141 Add ASI link to SMAPI SkuV1 dependency. To Do Branimir Giurov Within 14 days 2 695450880 Premium APIM failing update to 0.49 No Platform(InternalOnly) 2025-10-28
35713958 [ASD][Docs] - Update docs about transfer out to include ICANN verification and potential DNS changes To Do Yutang Lin Within 30 days 2 701345373 WA-WebSites:APIHUB [Domains] [App Service Domain Possibly Parked] No ServicingLoop 2025-10-27
35708347 Follow-up with BRAIN team on why the subscriptions impacted dashboard shows empty during the entire icm timeline To Do Branimir Giurov Within 14 days 2 703101770 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-10-27
35699338 alert(platform) Add alert to detect this SMAPI bad MSI state with threshold of 1 service per region To Do Gleb Feoktistov Within 14 days 2 695540880 CHUBB | SEV 1| 500 error when trying to add an API No Backend 2025-10-24
35677670 MSAL Library should use WithAzureRegion where possible to avoid an extra instance discovery call In Review Martin Dechev 3 2 700863452 [BLEU Cloud Buildout] : Remote Cert Invalid error when query discovery Endpoints... No Backend 2025-10-22
35677486 BLEU Cloud blocks the CRL endpoint for DigiCert To Do Samir Solanki 1 2 700863452 [BLEU Cloud Buildout] : Remote Cert Invalid error when query discovery Endpoints... No Backend 2025-10-22
35605356 Sync Public IP FQDN code skips updating management endpoint DNS New Unassigned Within 60 days 2 694632568 Customers making requests to [MICROSOFT.APIMANAGEMENT] (on endpoint api-noe-prod... No Backend 2025-10-16
35605129 Alert and auto-mitigate management endpoint DNS issue due to public IP FQDN update To Do Unassigned Within 30 days 2 694632568 Customers making requests to [MICROSOFT.APIMANAGEMENT] (on endpoint api-noe-prod... No Backend 2025-10-16
3405003 [Brain Coverage] - Improve SLI Quality/Coverage to address missing QCO outage detection for Incident 689298102(Brain Gen... To Do Cynthia Ibarra 3 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-10-16
35588116 Improve Logging for Health Monitor Manager for Enqueue job and start/stop failures To Do Unassigned Within 14 days 2 696825053 [api-cy4-prod-01-rp] API RP Orchestration Alert: RegionalRpHealthMonitorJob Orch... No SMAPI 2025-10-15
35586795 Revert Changes to the SetBackend Policy - Set BaseUrl should not set Context.LifeTimeScope.BackendId (Breaking Change) In Review Unassigned Within 14 days 2 695839076 Cx wishes to learn details about recent APIM update No Backend 2025-10-15
35584808 fix(platform) Log successful billing events along with the errors New Nina Ren Within 30 days 2 698012657 [Usage Record Errors] PAUsageRecordErrors [ApiManagement] is unhealthy. No Backend 2025-10-15
35576440 fix(platform) Ensure serialization of billing event data read from database is in UTC and NOT dependent on machine timez... New Nina Ren Within 30 days 2 698012657 [Usage Record Errors] PAUsageRecordErrors [ApiManagement] is unhealthy. No Backend 2025-10-15
35509091 Make sure policy forms store secrets as named values New Javier Borrego Within 30 days 2 31000000457979 [MSRC] [101914] - ElevationOfPrivilege - EOP to get full access to APIM APIs wit... No Backend 2025-10-10
35388178 Do not log connectivity issue to backend as error in ProxyRequest for forward-request New Tom Kerkhove Within 30 days 2 695851120 BRAIN detected an unusual trend in SLI "Success Rate" for API Manageme... No Backend 2025-10-09
35499024 Resolving Product Scope when using Api Scoped subscription In Progress Ansul Goenka Within 14 days 2 695211616 CRI- Open product is defaulting to the request made with or without API Scoped s... No ServicingLoop 2025-10-09
35468623 fix(SMAPI) Ensure SMAPI MSI token acquisition process is resilient with missing settings New Unassigned Within 30 days 2 695540880 CHUBB | SEV 1| 500 error when trying to add an API No Backend 2025-10-08
35468610 fix(platform) Ensure platform rollback also rolls back or reset the state of components like SMAPI New Unassigned Within 30 days 2 695540880 CHUBB | SEV 1| 500 error when trying to add an API No Backend 2025-10-08
35452850 diagnose(gw) Add detector to detect and diagnose Gateway poison message state To Do Nima Kamoosi Within 60 days 2 689371200 404 resource not found error when trying to consume the API - 2509230040008478 No Gateway 2025-10-07
3384238 [Brain Coverage] - Improve SLI Quality/Coverage to address missing QCO outage detection for Incident 689298102(Brain Gen... To Do Unassigned 1 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-10-07
35449624 alert(gw) Log Sev3 IcM if less than threshold (4) services are affected by event poison issue To Do Nima Kamoosi Within 30 days 2 689371200 404 resource not found error when trying to consume the API - 2509230040008478 No Gateway 2025-10-07
3394982 [Update TSG] Ensure RP Readiness for ESTS-R New Harish Rane 1 2 692175630 Incorrect Authentication URL Being Sent During Managed Identity Credentials for ... No ServicingLoop 2025-10-07
35432809 [Brain Coverage] Improve SLI Quality/Coverage to address missing QCO outage detection for Incident 691848179 (Service H... To Do Unassigned Within 60 days 2 691848179 APIHub [HealthMonitor] PowerPlatform Dataplane Failure Rate > 1% for power-ap... Yes ServicingLoop 2025-10-06
35415336 LLM deserialization fails when payload contains null "content" property New Ethan Lao Within 14 days 2 690634373 2509240040007745 | LLM Logging - request body is NULL on eventhubs No Backend 2025-10-03
35407879 Provide public documentation on not doing path traversal protection New Unassigned Within 14 days 2 31000000453736 [MSRC] [101723] - Azure - ElevationOfPrivilege - Critical Security Boundary Bypa... No Gateway 2025-10-03
35395678 [SLA Dip][CRI] While deleting Groups we were running into the error for sequence contains no matching element In Review Macko Treder Within 30 days 2 692607867 Not able to delete a group in APIM service - the operation returns an internal s... No Backend 2025-10-02
35388021 Auto-transfer Gateway Sev3s to gateway loop when instructed to do so New Unassigned Within 14 days 2 691978131 [Public] centraluseuap-SkuV1 (Er:12 x Svc:12) - [v2][Gateway]: 2+ new Gateway R... No Backend 2025-10-02
35387795 Auto-create "new Gateway Requests error types detected" alert as Sev3 when BVT service only New Tom Kerkhove Within 14 days 2 691978131 [Public] centraluseuap-SkuV1 (Er:12 x Svc:12) - [v2][Gateway]: 2+ new Gateway R... No Backend 2025-10-02
35345658 Do not fire "[v2][Gateway]: 2+ new Gateway Requests error types detected on 10+ upgraded services anytime in past 1... In Review Tom Kerkhove Within 14 days 2 692092698 [Public] centraluseuap-SkuV1 (Er:14 x Svc:14) - [v2][Gateway]: 2+ new Gateway R... No ServicingLoop 2025-09-30
35344650 apim-bvt-websocket-server.westus.cloudapp.azure.com:8010 is not reachable New Unassigned Within 14 days 2 691978131 [Public] centraluseuap-SkuV1 (Er:12 x Svc:12) - [v2][Gateway]: 2+ new Gateway R... No Backend 2025-09-30
3385905 Alert on issuance for any certificates in Production not published to GetIssuer in respective cloud New Sonal Rajul Danak Within 14 days 2 689984195 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2025-09-25
3385893 Ensure OneCert integration with GetIssuer is not bypassed during certificate issuance New Oleksii Pechenev Within 14 days 2 689984195 [api-euapbn1-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SL... No Backend 2025-09-25
35137221 Fix ACIS Update API Service Container re: validation failing when trying to update MSI attributes New Shilpa Mani Within 30 days 2 689677560 APIM Premium Tier Service Down Following Capacity Spike No Backend 2025-09-25
35089445 fix(gw) Fix NullReferenceException thrown while processing Backend event in Gateway New Mahsa Sadi Within 30 days 2 689371200 404 resource not found error when trying to consume the API - 2509230040008478 No Gateway 2025-09-24
35061486 Workspace gateway termination should can trigger billing New Vidisha Shah Within 30 days 2 670389034 [Usage Record Errors] PAUsageRecordErrors [APIManagement] is unhealthy. No Backend 2025-09-24
34994884 Remove client IP scrubbing from DGrep To Do Vitalii Kurokhtin Within 60 days 2 670556948 Immediate Fix Required: Regression in Kusto Cluster Impacting GHOST Threat Hunte... No Backend 2025-09-19
34976852 Log content type and content length header in ProxyRequest and GatewayOutgoingRequests Active Mahsa Sadi Within 30 days 2 682180143 [CSS] -AOAI-Experience-Platform [Jitu] [Southeast Asia APIM client time surge ca... No Backend 2025-09-18
34974319 Billing should perform auto restart upon failure To Do Nina Ren Within 30 days 2 686979342 [api-ase-prod-01-rp] API RP Orchestration Alert: BillingConsumptionSku Orchestra... No ServicingLoop 2025-09-18
34962314 Disassociate APIM Infra Resources with Cx Owned Resources Upon VMSS Change To Do Shilpa Mani Within 30 days 2 681097514 Not able to assign public IP with new subnet - JLL - 2509060040001054 No Backend 2025-09-18
34961837 Missing subscription whitelisting for new SKUv1 consumption subscriptions Committed Kriti Majumdar Within 30 days 2 686769141 [api-pn1-prod-01-rp] API RP Alert: Unhealthy ActivateConsumptionService orchestr... No Backend 2025-09-18
34954014 Adding ServiceStorage account to RP, does not push configuration to Consumption SMAPI New Samir Solanki Within 30 days 2 686769141 [api-pn1-prod-01-rp] API RP Alert: Unhealthy ActivateConsumptionService orchestr... No Backend 2025-09-17
34917026 Identify high traffic Power regions and sku up ASPs to P3v3 To Do Michael Rowden Within 30 days 2 675182331 APIHub [HealthMonitor] PowerPlatform Dataplane Failure Rate > 1% for power-ap... No ServicingLoop 2025-09-15
34917008 Create alert for massive traffic increase In Progress Michael Rowden Within 14 days 2 675182331 APIHub [HealthMonitor] PowerPlatform Dataplane Failure Rate > 1% for power-ap... No ServicingLoop 2025-09-15
34854644 Block customers from configuring auto-scale on Max Capacity metric on API level (if possible) To Do Ansul Goenka Within 30 days 2 670844126 Apim No Backend 2025-09-11
34854639 Auto-scale best practices re: don't use auto-scale based on Max Capacity metric To Do Sreekanth Thirthala Venkata Within 14 days 2 670844126 Apim No Backend 2025-09-11
34854603 Implement CRI validation wf for APIM and APIHub To Do Branimir Giurov Within 14 days 2 677671854 RCA on "Consumption Workflow Trigger execution success rate" for Logic... No Backend 2025-09-11
34854565 Transient severity and partner support team education To Do Branimir Giurov Within 14 days 2 677671854 RCA on "Consumption Workflow Trigger execution success rate" for Logic... No Backend 2025-09-11
34854547 APIM connectivity and troubleshooting tips for 1P teams To Do Michael Rowden Within 14 days 2 677671854 RCA on "Consumption Workflow Trigger execution success rate" for Logic... No Backend 2025-09-11
34854538 Implement periodic tracert for the active backend To Do Maxim Kim Within 30 days 2 677674711 api gateway traffice failing with 500 No Backend 2025-09-11
34854527 Docs should clearly state that customers shouldn't restrict private IP addresses To Do Sreekanth Thirthala Venkata Within 14 days 2 677674711 api gateway traffice failing with 500 No Backend 2025-09-11
34851918 Add link to data-plane investigations dashboard in ASI To Do Unassigned Within 14 days 2 682180143 [CSS] -AOAI-Experience-Platform [Jitu] [Southeast Asia APIM client time surge ca... No Backend 2025-09-11
34789976 Activation failing during managed identity provisioning New Unassigned Within 30 days 2 675900593 [api-jinc-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (... No ServicingLoop 2025-09-08
34789948 Activation fails on assigning storage permission to Managed Identities because it was not found (not created?) New Unassigned Within 30 days 2 676479213 [api-idc-prod-01-rp] API RP Alert: SKUv1 Activation SuccessRate Below 95% SLA (w... No Backend 2025-09-08
34634884 Enable Brain Autocomms for ARM RP SLI - /PROVIDERS/MICROSOFT.APIMANAGEMENT/SERVICE PUT (Error Budget) (Service Health Re... To Do Branimir Giurov Within 30 days 2 669728054 BRAIN detected an unusual trend in SLI "ARM RP SLI - /PROVIDERS/MICROSOFT.A... Yes Backend 2025-08-29
34632778 [LiveSite] - Scaling up Redis Cache SKU from P2 -> P3 for eur002-002 In Progress Ji Hoon Kim Within 14 days 2 677289854 [Public] ApiHubAdminFrontEnd [Kusto]: High Response Time - power-rp-europe002-00... No ServicingLoop 2025-08-29
34596741 Time range filter is not honored for container capping information in ASI New Unassigned Within 14 days 2 676140348 [Deloitte (O365D), S500, 2508250040008386] Capacity spike on Premium SKU service... No Backend 2025-08-27
34596738 ASI is showing disconnected information without providing a warning New Unassigned Within 14 days 2 676140348 [Deloitte (O365D), S500, 2508250040008386] Capacity spike on Premium SKU service... No Backend 2025-08-27
34596727 Tenant capping is calculated on old SKU, and not new, during scale up/down New Roman Kolesnikov (APIM) Within 14 days 2 676140348 [Deloitte (O365D), S500, 2508250040008386] Capacity spike on Premium SKU service... No Backend 2025-08-27
34571460 feature(gw) Ensure we log ProxyRequest user agent as part of security auditing log enablement Active Nima Kamoosi Within 30 days 2 667991921 [Liberty Shield 3] Collect JA4 Logs for your service (6ba70dfa-ead9-4cc1-b894-04... No Backend 2025-08-25