Incidents | Hyper Platform Services Incidents reported on status page for Hyper Platform Services https://status.hyperar.com/ https://d1lppblt9t2x15.cloudfront.net/logos/fbbe9bfcde0e79114c5793c52cfdffea.png Incidents | Hyper Platform Services https://status.hyperar.com/ en Hyper Portal is down https://status.hyperar.com/incident/507666 Wed, 05 Feb 2025 11:41:55 -0000 https://status.hyperar.com/incident/507666#3e9aec9cbe82b26e4e9096e86a84ad8033c06ec1dd4ec43c73e6e849ef7d74f4 Hyper Portal recovered. Hyper Portal is down https://status.hyperar.com/incident/507666 Wed, 05 Feb 2025 11:20:47 -0000 https://status.hyperar.com/incident/507666#e3b0fc630490d6bf04e58b0c6dbbe37ca6dfa8f3b46297b87cb05df3ccf39edf Hyper Portal went down. Hyper Portal is down https://status.hyperar.com/incident/476122 Wed, 11 Dec 2024 18:10:58 -0000 https://status.hyperar.com/incident/476122#7e66e78b11d3b12cd11eae89cda08462b60d73f753c761a9259af0ad6c05165e Hyper Portal recovered. Hyper Portal is down https://status.hyperar.com/incident/476122 Wed, 11 Dec 2024 18:05:44 -0000 https://status.hyperar.com/incident/476122#972fed4cfa7b1447741bce82e7f3e5a9c1c0d435925061191d1b2463e69db8cb Hyper Portal went down. Preventative measures https://status.hyperar.com/incident/469785 Fri, 29 Nov 2024 17:00:24 +0000 https://status.hyperar.com/incident/469785#9b20e65c607fb67b3c7485413c2b756b8b95e868f84c9a4c934305752e62471d Maintenance completed Preventative measures https://status.hyperar.com/incident/469785 Fri, 29 Nov 2024 14:30:24 -0000 https://status.hyperar.com/incident/469785#732bc1248075d11ba199068981dc950b20ddac6f04ac6a535cb27064b1e7fd03 We are aware of an issue with upstream packages in our development and QA environments. We are taking preventative measures to ensure production uptime. 
Deployment https://status.hyperar.com/incident/454403 Fri, 01 Nov 2024 12:15:04 +0000 https://status.hyperar.com/incident/454403#74de6f83d7476165daf1b69e9774edccbc400c55e1a8a227b53a0340be9bb886 Maintenance completed Deployment https://status.hyperar.com/incident/454403 Fri, 01 Nov 2024 11:15:04 -0000 https://status.hyperar.com/incident/454403#09457175916e0e9504edf27e91416804649d479d7af92ac1cafc99440a30ffaf Deployment Deploy Update https://status.hyperar.com/incident/449917 Thu, 24 Oct 2024 12:10:05 +0000 https://status.hyperar.com/incident/449917#b8202dcd29b383167489681c0b75dd7306393115b68b46b3e19f970aac353fb7 Maintenance completed Deploy Update https://status.hyperar.com/incident/449917 Thu, 24 Oct 2024 11:10:05 -0000 https://status.hyperar.com/incident/449917#8d31a14fa123451dcfd58a8c97d7bcd6db789e82b005816a3d546ec0126eb2c0 Deploy Update Hyper Portal is down https://status.hyperar.com/incident/448147 Mon, 21 Oct 2024 12:40:00 -0000 https://status.hyperar.com/incident/448147#e6456739d5231cfd686573a794a36a0b4d299290ffb586460266651c63a9820f ### 1. Incident Overview The Hyper Portal was affected by a complete service outage. The issue was caused by a dependency update that broke SemVer conventions. The incident was raised by an automated health check at 12:45 BST on Monday 21st October. The cause was identified and resolved by 13:06 BST. ### 2. Incident Details When scaling, the Hyper Portal pulls its runtime dependencies from NPM. A change in the dependency [pdf-to-png-converter](https://www.npmjs.com/package/pdf-to-png-converter) broke backwards compatibility without a major version bump. This caused new instances of the Hyper Portal to fail to start up. With the service unreachable, the load balancer returned HTTP 502 errors. ### 3. Incident Response 12:45:26 BST 2024-10-21 Automated monitoring reports the outage. 12:45:59 BST 2024-10-21 Harry Jarman (Web Lead) acknowledges the incident.
13:01:45 BST 2024-10-21 Harry Jarman (Web Lead) creates, publishes, and begins deployment of a hotfix. 13:06:27 BST 2024-10-21 Automated monitoring confirms successful deployment and restoration of service. ### 4. Root Cause Analysis The issue was caused by an automatic update to a package which broke backwards compatibility within a minor version bump. This was possible because the dependency was declared with a version range that allows minor bumps, which is the default behaviour of NPM. ### 5. Impact and Consequences The overall impact of the outage was a loss of service for the SDKs and mapping tools. ### 6. Incident Closure The incident was closed at 13:06 BST upon internal confirmation that the issue was no longer occurring. Hyper Portal is down https://status.hyperar.com/incident/448147 Mon, 21 Oct 2024 12:09:52 -0000 https://status.hyperar.com/incident/448147#2379bad13edb55776e37d341875b263dc832ec70a9ad2bdbb41273e3bbc3e63f Hyper Portal recovered. Hyper Portal is down https://status.hyperar.com/incident/448147 Mon, 21 Oct 2024 11:45:37 -0000 https://status.hyperar.com/incident/448147#2d2fc7e938ad37467ceadff7a40944e333b2e45cd4e5e87fc63391838cb770ed Hyper Portal went down.
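The root cause described in the 21 October 2024 postmortem above (a dependency range that admits minor bumps) can be illustrated with a minimal sketch. It assumes plain `MAJOR.MINOR.PATCH` versions with no pre-release tags, and the version numbers are hypothetical, not the actual `pdf-to-png-converter` releases involved. The function names are illustrative and not part of any NPM tooling.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (pre-release tags not handled)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_allows(declared: str, candidate: str) -> bool:
    """Approximate the caret range '^declared'.

    A caret range accepts any version at or above the declared one that
    does not change the leftmost non-zero component.
    """
    base, cand = parse(declared), parse(candidate)
    if cand < base:
        return False
    if base[0] > 0:
        # ^1.2.0 accepts every 1.x.y at or above 1.2.0
        return cand[0] == base[0]
    if base[1] > 0:
        # ^0.2.0 accepts only 0.2.x at or above 0.2.0
        return cand[:2] == base[:2]
    # ^0.0.3 accepts only 0.0.3 itself
    return cand == base


def exact_allows(declared: str, candidate: str) -> bool:
    """An exact pin accepts only the declared version."""
    return parse(declared) == parse(candidate)


if __name__ == "__main__":
    # Hypothetical versions: a fresh install under '^1.2.0' may pick up 1.3.0.
    print(caret_allows("1.2.0", "1.3.0"))
    print(exact_allows("1.2.0", "1.3.0"))
```

Under these assumptions, a fresh install on a newly scaled instance can resolve to a newer minor release than the one originally tested, while an exact pin cannot. Installing from a committed lockfile with `npm ci` is another common way to keep fresh installs on the tested versions.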
Updating infrastructure https://status.hyperar.com/incident/441152 Tue, 08 Oct 2024 11:51:45 +0000 https://status.hyperar.com/incident/441152#24c4ab5a35a6003312ff40924ac07ccdfadd11fb2b7bc6d0cfc171864b0d5b1a Maintenance completed Updating infrastructure https://status.hyperar.com/incident/441152 Tue, 08 Oct 2024 10:51:45 -0000 https://status.hyperar.com/incident/441152#a92f7e960c50099fbe9b06448ef0172f74427f2d29d998d9f2e4299475a18cb8 Updating infrastructure Release https://status.hyperar.com/incident/435190 Thu, 26 Sep 2024 15:30:00 +0000 https://status.hyperar.com/incident/435190#7f57387bd0592c9098a42c5c16eb4655e5c24142463870d46502b02e4b485a8e Maintenance completed Release https://status.hyperar.com/incident/435190 Thu, 26 Sep 2024 14:10:46 -0000 https://status.hyperar.com/incident/435190#c98179d766483686b1c49188506cabf29bf6329d8b0d08236610ce66c686c20c Standard release Some HMDFs unavailable for download https://status.hyperar.com/incident/427143 Tue, 10 Sep 2024 14:02:00 -0000 https://status.hyperar.com/incident/427143#c8146197fcd6aaae997c58d43ccc79b6f4ee5be955c9e45bf3a85d69d0d4aa9a # Incident reported: 10/09/2024 15:02 Requests for HMDF version 2.1.0 were returning HTTP 404. ## Update: 10/09/2024 15:19 Issue identified. The cause is the API failing to upgrade the request to the latest patch version of HMDF, in this case 2.1.1. ## Update: 10/09/2024 15:40 Fix implemented and tested by developers. ## Update: 10/09/2024 16:40 All instances in the production environment are now running the fixed version. ## Update: 10/09/2024 16:44 Incident resolved.
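The behaviour described in the 10 September 2024 update above, upgrading a request to the latest patch version, can be sketched as follows. This is only an illustration under stated assumptions: the function name `resolve_latest_patch`, the list of available versions, and the choice to return `None` when nothing matches are all hypothetical and do not reflect the Hyper Portal API's actual implementation.

```python
from typing import Iterable, Optional, Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def resolve_latest_patch(requested: str, available: Iterable[str]) -> Optional[str]:
    """Return the highest available patch release in the requested MAJOR.MINOR line.

    Returns None when no release shares the requested MAJOR.MINOR, which a
    caller could translate into an HTTP 404.
    """
    major, minor, _ = parse(requested)
    candidates = [v for v in available if parse(v)[:2] == (major, minor)]
    if not candidates:
        return None
    return max(candidates, key=parse)


if __name__ == "__main__":
    # Hypothetical catalogue in which 2.1.0 has been superseded by 2.1.1.
    catalogue = ["2.0.3", "2.1.1", "2.2.0"]
    print(resolve_latest_patch("2.1.0", catalogue))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating `2.1.10` as lower than `2.1.9`.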
Platform upgrade https://status.hyperar.com/incident/427025 Tue, 10 Sep 2024 11:00:00 +0000 https://status.hyperar.com/incident/427025#6c932a0ba333cfbc8477e72f9af0da1713e627e673a02f9fb0bbe4f8843a5bf5 Maintenance completed Platform upgrade https://status.hyperar.com/incident/427025 Tue, 10 Sep 2024 10:00:00 -0000 https://status.hyperar.com/incident/427025#82d9ee0273ca62b50050600af316c21a3d2f2249ab0f4add1105dc774747bd0d Upgrading HyperPortal to release 2024.9.1 Hotfix Wifi https://status.hyperar.com/incident/421237 Thu, 29 Aug 2024 13:00:00 +0000 https://status.hyperar.com/incident/421237#298b9ecd9fef61cd364dedd412650eb66c22d1d49cbf81f01924339730fe4e7e Maintenance completed Hotfix Wifi https://status.hyperar.com/incident/421237 Thu, 29 Aug 2024 12:00:00 -0000 https://status.hyperar.com/incident/421237#037ba961adc34e3d4bb75c1431b43f73ce97bcd02b5885b0ad136dd99e627522 Hotfix Hot fix https://status.hyperar.com/incident/412598 Mon, 12 Aug 2024 12:03:00 +0000 https://status.hyperar.com/incident/412598#fe7dba708bdccdee5c9b588b4a7ceff00f78b251518be54b1a14076838d938d4 Maintenance completed Hot fix https://status.hyperar.com/incident/412598 Mon, 12 Aug 2024 11:03:00 -0000 https://status.hyperar.com/incident/412598#7a187da24ca18fd4b1f2955486c4c0e8bb93ff6fda6643a26f75c329882c90ac hot fix Deployment https://status.hyperar.com/incident/410884 Thu, 08 Aug 2024 15:34:00 +0000 https://status.hyperar.com/incident/410884#299de9113e9b95d937f7b85db80f1dbfd23f29a31253355ad66018b72e77f826 Maintenance completed Deployment https://status.hyperar.com/incident/410884 Thu, 08 Aug 2024 14:34:00 -0000 https://status.hyperar.com/incident/410884#9d17071713e1136edaf0144a9ade446491637013563aed262e64a0d4ad763711 Deployment Deployment https://status.hyperar.com/incident/410159 Wed, 07 Aug 2024 09:05:00 +0000 https://status.hyperar.com/incident/410159#4c356b062722b4408968d34996ff9f2655d7d547d70ffbfe48fdda4b57d9560b Maintenance completed Deployment https://status.hyperar.com/incident/410159 Wed, 07 Aug 
2024 08:05:00 -0000 https://status.hyperar.com/incident/410159#4f1340d88be477b26e91949e91822c24c88bf2b6ea4cfa6cacd05ddb88b15428 Deploying a hotfix Hotfix deployment https://status.hyperar.com/incident/401744 Mon, 22 Jul 2024 16:00:00 +0000 https://status.hyperar.com/incident/401744#3684c8073d3ab837c85387680b5420d1d010bd559b4541b08c35cd0cfe329ed6 Maintenance completed Hotfix deployment https://status.hyperar.com/incident/401744 Mon, 22 Jul 2024 15:30:00 -0000 https://status.hyperar.com/incident/401744#930152afc704d2a04e7dd5fef0257a74e06d4d1db7d6b3cf84c7e44793e921a1 Deployment Hotfix deployment https://status.hyperar.com/incident/399458 Wed, 17 Jul 2024 11:45:00 +0000 https://status.hyperar.com/incident/399458#172bc2348d36d4269500955ce36590c5e1215c2b7c6b4cf64b8754515913153f Maintenance completed Hotfix deployment https://status.hyperar.com/incident/399458 Wed, 17 Jul 2024 10:45:00 -0000 https://status.hyperar.com/incident/399458#cfe2368e8321f69d6b058b0bb9c1b400e6959bc5289c97838bfff649bf8332a0 Deploying a hotfix Deployment https://status.hyperar.com/incident/398966 Tue, 16 Jul 2024 14:35:00 +0000 https://status.hyperar.com/incident/398966#0af4d61a2b288aea89bc7ca4c9beebf3dab07a128c9f2a6082ff0ae7dd18dfcb Maintenance completed Deployment https://status.hyperar.com/incident/398966 Tue, 16 Jul 2024 14:25:00 -0000 https://status.hyperar.com/incident/398966#03ceaedca26d7714aae6468200f09ce2f1434adac6a396b716e231003b5d9319 Hotfix deployment Deployment https://status.hyperar.com/incident/398862 Tue, 16 Jul 2024 11:30:00 +0000 https://status.hyperar.com/incident/398862#b0c24132d4dc25a227155a3af90300ed752c7b8dd11ed5116910e253610e8be0 Maintenance completed Deployment https://status.hyperar.com/incident/398862 Tue, 16 Jul 2024 10:30:00 -0000 https://status.hyperar.com/incident/398862#4ec3b67a4ff93536e599949b630c8661651f4443cf7e69dc072b5cf64f757d59 Deploying new version New Release https://status.hyperar.com/incident/393541 Thu, 04 Jul 2024 15:15:00 +0000 
https://status.hyperar.com/incident/393541#248765561131f5fcf1ff9f7a489f71fac91022c3bf528abba6653174a79722c5 Maintenance completed New Release https://status.hyperar.com/incident/393541 Thu, 04 Jul 2024 12:15:00 -0000 https://status.hyperar.com/incident/393541#a76255b0ad45927a1bb5559330229567305d81232764eb89353c3304dd95e0f0 Deployment of a new release Hot Fix Deployment https://status.hyperar.com/incident/383218 Wed, 12 Jun 2024 12:00:00 +0000 https://status.hyperar.com/incident/383218#651851929195608dacf1272f2a719b04e0d18925b8da0ca66f221f5745b7e780 Maintenance completed Hot Fix Deployment https://status.hyperar.com/incident/383218 Wed, 12 Jun 2024 11:30:00 -0000 https://status.hyperar.com/incident/383218#d19edc33be345561d93c22e2a8b84fd1ae1372d51524513aed3ab9874baed31e We are deploying two hot fixes Minor infrastructure update https://status.hyperar.com/incident/370197 Thu, 16 May 2024 11:30:00 +0000 https://status.hyperar.com/incident/370197#f728ed9d6d6ddc4a62b2132ea0fea3679382ba05a76e2b0f00fd224ff32cb809 Maintenance completed Minor infrastructure update https://status.hyperar.com/incident/370197 Thu, 16 May 2024 10:11:00 -0000 https://status.hyperar.com/incident/370197#df1a5b4e970ea29095b108e060f21880070dd4d13bed06c0e8832f54f35367f0 A small change Continuation of Release https://status.hyperar.com/incident/369207 Tue, 14 May 2024 15:00:00 +0000 https://status.hyperar.com/incident/369207#f73c382ac270f08be7f0a6fa65d2bbec470d7f7e24257418d73ca71dfbcaaa1b Maintenance completed Continuation of Release https://status.hyperar.com/incident/369207 Tue, 14 May 2024 13:08:00 -0000 https://status.hyperar.com/incident/369207#b2c2f638e9250f51f33991f8c862418e14530915ec79b40765419996a098f14f A continuation of the release started this morning Planned Release https://status.hyperar.com/incident/369029 Tue, 14 May 2024 09:00:00 +0000 https://status.hyperar.com/incident/369029#b22ee91d0aad69df6b8cd132d075d6ca365bd1a60f282d62e76345e2f9fb708f Maintenance completed Planned Release 
https://status.hyperar.com/incident/369029 Tue, 14 May 2024 07:00:00 -0000 https://status.hyperar.com/incident/369029#3357b0df4dd994c75fe0dfdc6d39ab95fd783fb42f1c3d780f7208908917b497 Planned major release Planned release https://status.hyperar.com/incident/362664 Wed, 01 May 2024 10:00:00 +0000 https://status.hyperar.com/incident/362664#0bc71da242bd95cdb78ef9fea99b103ef7d26b32deb4490893537a1b9908fbb9 Maintenance completed Planned release https://status.hyperar.com/incident/362664 Wed, 01 May 2024 09:30:00 -0000 https://status.hyperar.com/incident/362664#dff1476a5378dafdb03e56a2e18bf7857a2f224bb7c14de181767955dbdbc1d8 We are releasing updates to the platform. Terraform Refresh https://status.hyperar.com/incident/333215 Wed, 28 Feb 2024 15:50:00 +0000 https://status.hyperar.com/incident/333215#f5af37385091b12b071894a961a9864e21f634c20f958126dcd9c38b1fa3dfe5 Maintenance completed Terraform Refresh https://status.hyperar.com/incident/333215 Wed, 28 Feb 2024 15:10:00 -0000 https://status.hyperar.com/incident/333215#aa410e6841ef7f1099e63ec9543e45ce007ae8cda956e4deadee11ad5c188627 Terraform state is drifting out of sync with the real infrastructure, so a refresh of the Terraform state is required. So far this has been performed successfully on all three pre-production environments. Downtime is not expected. Hyper Portal service outage https://status.hyperar.com/incident/259628 Wed, 23 Aug 2023 09:49:00 -0000 https://status.hyperar.com/incident/259628#11e01309e926ba8633e9798d23beea52dc140e533a495560086e888440d3126a ### 1. Incident Overview The Hyper Portal was affected by a service outage serving HMDFs to the client SDKs. The issue was caused by a failure to pull dependencies from NPM during an automated instance restart. The incident occurred at 09:19 BST and was reported by a customer at 09:24 BST on Wednesday 23rd August, and the Web Team were notified immediately. At 09:37 BST, the customers were informed that the issue was being investigated.
The cause was identified and the issue was resolved by 10:41 BST. ### 2. Incident Details AWS Elastic Beanstalk automatically scales instances within the deployment. In this case it deployed a new instance, which triggered a re-download of the application dependencies. A dependency failed to install, causing the server launch to fail. The service was subsequently unavailable, responding to requests with a 502 Bad Gateway. ### 3. Incident Response The Web Team responded immediately by diagnosing the problem. The issue was resolved by updating the dependency list of the application and initiating a redeployment at 10:39 BST, with restoration of services at 10:41 BST. The redeployment caused the application to re-download its dependencies and start up correctly. ### 4. Root Cause Analysis 1. What caused the new instance to fail in installing the dependency? 1. The dependency was not sufficiently configured, and a third-party change caused it to break. 2. What changed? 1. NPM installs dependencies in a relatively non-deterministic manner, the behaviour of which is constrained by configuration. A change in an upstream dependency caused the resulting dependency tree to change such that the service became misconfigured. 3. What fixed it? 1. The issue was fixed by updating the dependency configuration and redeploying the service. ### 5. Impact & Consequences The impact resulted in a Priority 1 (P1) outage that affected customers. The overall impact of the outage was a loss of map service for the SDKs, causing the mapping experience to be unable to function. ### 6. Lessons Learned - The alarms were not sufficient; we need to investigate how and why the outage was not known about sooner. The alarm reported that it was OK. We suspect this is not the case and need to investigate to confirm the exact time. - The ESLint configuration was not sufficient to bring this issue to light before deployment. ### 7.
Preventive Action The Hyper Portal is currently being migrated to ECS, in which the application is pre-built and stored in an image registry, meaning that the dependencies will not be required for deploying the application beyond the initial build. AWS uses an automated process for monitoring the health of instances. We will review the configuration in AWS to understand if we can adjust the parameters that determine when an instance is considered unhealthy, to avoid any unnecessary redeployments. We will set up an external health check service to coarsely monitor service availability. We will configure the ESLint plugin `eslint-plugin-import` to mitigate future occurrences of this issue. ### 8. Incident Closure Internal confirmation of services being back online at 10:47 BST, followed by external confirmation that the downstream services were functioning as expected again at 10:49 BST. Hyper Portal service outage https://status.hyperar.com/incident/259627 Thu, 03 Aug 2023 09:15:00 -0000 https://status.hyperar.com/incident/259627#3c8cd2aeda72e07f444b260470b6190df963aee3d800a45d4afa32d73a410a1a ### 1. Incident Overview The Hyper Portal was affected by a service outage serving HMDFs to the client SDKs. The issue was caused by an expired NPM token on the Hyper Portal server account that the API uses for reading HMDFs from S3. The incident was reported via our internal team at 09:07 BST on Thursday 3rd August. The Web Team were notified immediately, and the cause was identified and resolved by 09:57. During the outage, the issue was further raised by a customer at 09:19 BST, whilst we were working through the problem. A response to customers was given within 20 minutes as to the cause and timeline to remediate. ### 2. Incident Details The issue was caused by an edge case involving an expired NPM token, used for building the application, after tokens had been cycled. AWS Elastic Beanstalk automatically scales instances within the deployment.
In this case it deployed a new instance using an out of date build that referenced the expired token. The application was unable to successfully deploy and the service was subsequently unavailable, responding to requests with a 502 Bad Gateway. ### 3. Incident Response The Web Team responded immediately by diagnosing the problem and providing a fix and timeline within 15 minutes of the alert. The issue was resolved by ensuring that the updated token was referenced within the deployment, and then initiating a redeployment of the application at 9:57 am. ### 4. Root Cause Analysis The root cause is the way that NPM tokens are referenced within deployments. When a deployment is created it references the token that is available at that time. When the same build redeploys (which can happen for a number of reasons due to the auto scaling behaviour of the infrastructure) it uses the token that it was given during the initial deployment, rather than a newer token that may have been issued since (as was the case here). ### 5. Impact & Consequences The impact resulted in a Priority 1 (P1) outage that affected customers. The overall impact of the outage was a loss of map service for the SDKs, causing the mapping experience to be unable to function. ### 6. Lessons Learned 1. The alarms were not sufficient; we need to investigate how and why the outage was not known about sooner. The alarm reported that it was OK. We suspect this is not the case and need to investigate to confirm the exact time. 2. The redeployment occurred outside working hours, when we had no ability to act to rectify the problem. 3. We noticed it by coincidence internally via our Head of Growth whilst using the Portal to map a store. Once he refreshed the page he was presented with a 502 error. This was also tested on mobile with the same error. 4. We recommend cycling the tokens, since they were linked to one employee account.
Any changes to internal users will have no impact on service availability. ### 7. Preventive Action The Hyper Portal is currently being migrated to ECS, in which the application is pre-built and stored in an image registry, meaning that the token will not be required for deploying the application beyond the initial build. AWS uses an automated process for monitoring the health of instances. We will review the configuration in AWS to understand if we can adjust the parameters that determine when an instance is considered unhealthy, to avoid any unnecessary redeployments. NPM tokens should be generated from a shared admin account so that they do not need to be cycled when an employee leaves the company. ### 8. Incident Closure The incident was closed at 10:15 BST upon confirmation from the customer that the issue was no longer occurring. HMDF Downloads unavailable https://status.hyperar.com/incident/259635 Thu, 25 May 2023 09:15:00 -0000 https://status.hyperar.com/incident/259635#345b42c385d2ee83e74a3dc5aa684c69566545f2266b56cf5e43fc32a779bc8b ### 1. Incident Overview The Hyper Portal was affected by a service outage serving HMDFs to the client SDKs. The issue was caused by a permissions change on the Hyper Portal server account that the API uses for reading HMDFs from S3. The incident was reported by a customer at 07:11 BST on Friday 19th May; the issue was picked up internally by Hyper at 09:12 BST on 19th May. The cause was identified and resolved by 10:02 BST on Friday 19th May. ### 2. Incident Details The SDKs make requests to the HMDF endpoint of the Hyper Portal API, which is currently hosted on AWS Elastic Beanstalk. The API requests the HMDF from an S3 bucket before returning it to the client SDK. The unintended removal of the Hyper Portal user's S3 read permission left the API unable to read HMDFs from the S3 bucket, resulting in error responses from the API when HMDFs were requested.
This in turn caused any SDK clients to be unable to request new HMDFs. The Hyper Portal API logs are recorded using CloudWatch with a 2 week retention period. The issue was logged in the server logs with a message: > [HMDFService] Access Denied to the HMDF S3 storage bucket. The first event log entry mentioning this was logged at 17:29 BST 18th May. The Hyper Portal API health is recorded in the AWS Elastic Beanstalk Events Log with a retention duration of 2 weeks. The Hyper Portal entered the Warning state at 18:33 BST 18th May, and first entered the Degraded state at 22:28 BST 18th May. There were a number of transitions between Degraded, Warning, and Okay until it was resolved. ### 3. Incident Response 09:12 BST 2023-05-19 Hayley Hinsley (Mobile PM) responds to the customer report. 09:12 BST 2023-05-19 Hayley Hinsley (Mobile PM) notifies the teams responsible for the Hyper Portal and SDKs 09:13 BST 2023-05-19 Chris Rivers (Mobile Lead) identifies the API is responding with an error 09:28 BST 2023-05-19 Harry Jarman (Web Engineer) investigates the Hyper Portal logs 09:40 BST 2023-05-19 Harry Jarman (Web Engineer) notifies Hyper team that the cause has been identified 09:56 BST 2023-05-19 Harry Jarman (Web Engineer) applies the fix and notifies Hyper team that the issue has been resolved 09:57 BST 2023-05-19 Chris Rivers (Mobile Lead) confirms the fix 10:02 BST 2023-05-19 Hayley Hinsley (Mobile PM) communicates to customer that the issue has been fixed 10:15 BST 2023-05-19 Customer confirms the issue is resolved ### 4. Root Cause Analysis The issue was caused by the unintended removal of the Hyper Portal API server account's access to read from the HMDF S3 bucket due to human error. ### 5. Impact and Consequences The overall impact of the outage was a loss of map service for the SDKs, causing the mapping experience to be unable to function. ### 6.
Lessons Learned Two main process improvements have been identified as a result of the incident. Firstly, the issue was not observed by Hyper until the morning, when the team arrived for work. Secondly, it should not be easy for a developer to alter the server permissions unintentionally. ### 7. Mitigations for Preventing Future Occurrences In order to mitigate this issue in future, a number of improvements have been implemented: - Hyper Portal alerts have been created to monitor the service health of the Hyper Portal. - Hyper Portal alerts have been set up to notify a shared email that relays messages to: - Neil Thomson (Web Lead) - Harry Jarman (Web Engineer) - Greg Sims (Web Engineer) - Andrew Hart (CEO) - Hyper Portal alerts have been set up to post messages in Hyper's internal Slack channel - A specific tag has been applied to all Hyper Portal AWS resources. An IAM policy has been applied to all users, except Neil Thomson (Web Lead), Harry Jarman (Web Engineer), Greg Sims (Web Engineer), that prevents them from being able to interact with these resources. Future improvements and recommendations - A policy allowing modification of Hyper Portal resources will be created and will be accessible by explicitly assuming a role on a time-limited basis - this will then be applied to all users, with no exceptions. - A process for requesting permission to assume the above role will be defined, and requests by Hyper employees for access to modify Hyper Portal resources will be logged and will require approval. ### 8. Incident Closure The incident was closed at 10:15 BST upon confirmation from the customer that the issue was no longer occurring.
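
The tag-based IAM restriction listed among the mitigations of the May 2023 postmortem above could take roughly the following shape. This is a sketch under stated assumptions, not Hyper's actual policy: the tag key `project`, the tag value `hyper-portal`, and the statement ID are hypothetical. It relies on the `aws:ResourceTag/<key>` global condition key, which not every AWS service or action supports, so a real policy would need checking per service. Exempt users would simply not have the policy attached.

```python
import json
from typing import Any, Dict


def build_deny_policy(tag_key: str, tag_value: str) -> Dict[str, Any]:
    """Build an IAM policy document that denies all actions on tagged resources.

    The tag key and value are supplied by the caller; the values used in the
    example below are hypothetical.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyTaggedHyperPortalResources",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                # Match only resources that carry the given tag.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical tag; print the JSON document for review before attaching it.
    print(json.dumps(build_deny_policy("project", "hyper-portal"), indent=2))
```

An explicit `Deny` takes precedence over any `Allow`, which is why this shape prevents accidental changes even for users who otherwise hold broad permissions.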