2 posts tagged with "cybersecurity"

The Great Internet Outage of June 12, 2025 - A Lesson in Digital Fragility

· 4 min read
Joseph HE
Software Engineer

Thursday, June 12, 2025, will be remembered as a day when the internet showed its weak points. A widespread outage hit many popular services and websites, exposing the vulnerability of a digital ecosystem that increasingly relies on a small number of hosting giants.

The Fragility of Our Digital Ecosystem

This outage brutally highlighted how much our daily internet access relies on a handful of major players. As Tim Marcin of Mashable pointed out, this incident "paints a picture of the fragility of our internet ecosystem when essential cogs malfunction." It's clear that many commonly used services depend on a small number of large providers, and a malfunction at one of them can have significant cascading repercussions.

The names that repeatedly surface are well-known: AWS (Amazon Web Services), Google Cloud, Azure (Microsoft), and Cloudflare. The June 12 outage primarily involved Google Cloud and Cloudflare, demonstrating an interdependence that surprised even industry experts.

Google Cloud at the Heart of the Storm

At the center of this interruption was a problem with Google Cloud Platform (GCP). Google quickly acknowledged "problems with its API management system." Thomas Kurian, CEO of Google Cloud, issued an apology, confirming a full restoration of services.

What emerged was Cloudflare's previously unsuspected reliance on Google Cloud. Long perceived as having an entirely independent infrastructure, Cloudflare revealed that some of its key services relied on GCP, particularly for a "long-term cold storage solution" linked to its Workers KV service. Initially, Cloudflare attributed the fault to Google Cloud, stating it was a "Google Cloud outage" affecting a limited number of its services.

The Cascading Impact of Cloudflare Workers KV

The Cloudflare Workers KV (Key-Value) service proved to be Cloudflare's Achilles' heel. Described as a "key-value store" and a "heart for tons of other things," its failure led to a cascade of incidents.

The outage lasted 2 hours and 28 minutes, globally impacting all Cloudflare customers using the affected services, including Workers KV, WARP, Access, Gateway, Images, Stream, Workers AI, and even the Cloudflare dashboard itself. This situation clearly demonstrated that Workers KV is a "critical dependency for many Cloudflare products and is used for configuration, authentication, and asset delivery."

Transparency and Accountability: The Cloudflare Example

A remarkable aspect of this incident was how openly Cloudflare took responsibility. Although the root cause was attributed to Google Cloud, Cloudflare released an incident report with rare candor. Dane Knecht, Cloudflare's CTO, stated: "We let our customers down at Cloudflare today. [...] This was a failure on our part, and while the immediate cause or trigger of this outage was a third-party vendor failure, we are ultimately responsible for our chosen dependencies and how we choose to architect around them."

This attitude was widely praised as a corporate model, showing a "willingness to share absurdly high error rates" and the absence of "blame towards Google" in their report, proving a strong commitment to transparency.

Lessons Learned and Future Mitigation

Cloudflare quickly identified the problem and began working on solutions. The incident report details a rapid timeline of detection and classification of the incident at the highest severity level (P0). The company plans to strengthen the resilience of its services by reducing single points of dependency, notably by migrating Workers KV's cold storage to R2, its own S3-compatible object storage, to avoid relying on third-party storage infrastructure.

They are also working to "implement tools that allow them to gradually reactivate namespaces during storage infrastructure incidents," ensuring that critical services can operate even if the entire KV service is not yet fully restored.

The June 12, 2025 outage served as a brutal reminder of the web's increasing interdependence and the crucial importance of redundancy and diversification of dependencies, even for hosting giants. It compels us to re-evaluate the resilience of our digital architectures and strengthen collaboration among stakeholders for a more robust internet.

Source: https://mashable.com/article/cause-internet-outage-google-cloud-what-happened-june-12

The Hidden Dangers of C - Unpacking Memory Management Risks

· 5 min read
Joseph HE
Software Engineer

The C programming language. It's often hailed as the "mother of almost all modern languages," forming the bedrock of everything from operating systems and compilers to game engines and encryption tools. Its power and low-level control are unparalleled, making it indispensable for critical infrastructure. Yet, this very power comes with a demanding responsibility: manual memory management.

Unlike languages with automatic garbage collection, C forces developers to "grow up and manage memory by yourself." This means allocating memory with malloc and diligently freeing it with free once it's no longer needed. This seemingly simple contract between malloc and free hides a minefield of potential pitfalls. Mishandling this responsibility can lead to catastrophic security vulnerabilities and system instability, often manifesting as "undefined behavior" – a programmer's nightmare where anything, from a minor glitch to complete system compromise, can happen.
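As a minimal sketch of that contract, the function below allocates a temporary buffer, checks that the allocation succeeded, uses it, and frees it exactly once. The name sum_first_n and its return convention are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: sums 1..n using a temporary heap buffer.
   Returns 0 on success, -1 on invalid input or allocation failure. */
int sum_first_n(size_t n, long *out)
{
    assert(out != NULL);                 /* programmer error, not a runtime case */

    /* Reject empty requests and sizes whose byte count would overflow. */
    if (n == 0 || n > SIZE_MAX / sizeof(long))
        return -1;

    /* malloc can fail; its result must be checked before use. */
    long *values = malloc(n * sizeof *values);
    if (values == NULL)
        return -1;

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        values[i] = (long)(i + 1);
        total += values[i];
    }

    /* Every successful malloc is paired with exactly one free. */
    free(values);

    *out = total;
    return 0;
}
```

Every exit path after the successful malloc passes through the single free, which is the property the rest of this article shows being violated in different ways.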

Let's delve into some of the most common and dangerous memory management errors in C, illuminated by infamous historical incidents.

The Perils of C: Common Memory Management Risks

1. Buffer Overflows: When Data Spills Over

A buffer overflow occurs when a program attempts to write more data into a fixed-size buffer than the buffer can hold. C, by design, doesn't perform automatic bounds checking. This lack of a safety net means if you write past the end of an array or buffer, you can overwrite adjacent data in memory, including control data such as function pointers or return addresses on the stack.

The consequences are severe: undefined behavior, program crashes, or, most dangerously, arbitrary code execution. A classic example is the Morris Worm of 1988. This early internet scourge exploited a buffer overflow in the UNIX fingerd daemon, alongside a debug feature in sendmail, to inject malicious code, infecting an estimated 10% of the internet at the time. A simple conditional check on input size could have prevented the fingerd overflow.
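The kind of size check mentioned above can be sketched as follows. The helper copy_bounded is a hypothetical name, not a standard library function; it refuses input that does not fit instead of writing past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits,
   terminator included. Returns 0 on success, -1 if src is too long. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len >= dst_size)       /* len + 1 bytes are needed for the '\0' */
        return -1;             /* reject rather than overflow */

    memcpy(dst, src, len + 1); /* copies the characters and the '\0' */
    return 0;
}
```

With an 8-byte buffer, a 7-character string is accepted and an 8-character string is rejected, because the terminator would land one byte past the end.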

2. Heartbleed: A Lesson in Missing Length Checks

Strictly speaking a buffer over-read rather than an overflow (the bug reads past the end of a buffer instead of writing past it), the Heartbleed vulnerability (2014) in OpenSSL's heartbeat extension perfectly illustrates the danger of missing length validations. The server was designed to echo back a client's "heartbeat" message. The client would declare a certain message length and then send the data. The flaw? The server code didn't verify that the actual length of the received message matched the declared length.

Attackers could send a tiny message (e.g., "hello") but declare it as 64,000 bytes long. The server, trusting the declared length, would then read and return 64,000 bytes from its own memory, including the "hello" message plus an additional 63,995 bytes of whatever was immediately following the message in memory. This allowed attackers to passively leak sensitive data like private encryption keys, usernames, and passwords, impacting vast swathes of the internet.
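The missing validation can be illustrated with a simplified sketch. This is not OpenSSL's actual code: the function build_heartbeat_reply and its parameters are hypothetical, and the real protocol message carries more fields than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified illustration, not OpenSSL's code.
   received / received_len: the payload bytes that actually arrived.
   declared_len: the payload length the client claims to have sent.
   Returns the number of bytes written to reply, or -1 if rejected. */
long build_heartbeat_reply(const unsigned char *received, size_t received_len,
                           size_t declared_len,
                           unsigned char *reply, size_t reply_size)
{
    assert(received != NULL && reply != NULL);

    /* The kind of check Heartbleed lacked: the declared length
       must not exceed what was really received. */
    if (declared_len > received_len)
        return -1;

    /* The reply buffer needs its own bounds check as well. */
    if (declared_len > reply_size)
        return -1;

    memcpy(reply, received, declared_len);
    return (long)declared_len;
}
```

In this sketch, a 5-byte "hello" declared as 64,000 bytes is rejected outright, so nothing beyond the received payload is ever copied into the reply.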

3. Use-After-Free: Accessing Ghost Memory

This vulnerability arises when a program attempts to access a block of memory after it has been freed using free(). Once memory is freed, the allocator can hand the same block out again for another allocation. If a pointer still points to this now-freed (and potentially reallocated) memory, accessing it can lead to:

  • Crashes: If the memory has been reallocated and its contents changed, accessing it can cause the program to crash.
  • Data Corruption: Writing to reallocated memory can silently corrupt unrelated data elsewhere in the same program.
  • Arbitrary Code Execution: An attacker might intentionally trigger a use-after-free, cause the memory to be reallocated with malicious data, and then exploit the old pointer to execute their own code.

The Internet Explorer 8 vulnerability (2013) demonstrated this. It involved JavaScript deleting HTML elements, but a pointer to the freed object persisted. An attacker could then craft a malicious webpage that would trigger the use-after-free, leading to system compromise by simply visiting the site.
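One common defensive pattern is to clear every lingering reference at the moment an object is freed. The sketch below is loosely modeled on the scenario above; the struct and function names are hypothetical and bear no relation to Internet Explorer's real internals.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
struct element {
    int id;
};

struct document {
    struct element *focused;   /* non-owning reference to an element */
};

/* Allocates an element; returns NULL on allocation failure. */
struct element *element_create(int id)
{
    struct element *el = malloc(sizeof *el);
    if (el != NULL)
        el->id = id;
    return el;
}

/* Frees el and first clears the document's reference to it,
   so no dangling pointer survives the deletion. */
void document_remove(struct document *doc, struct element *el)
{
    assert(doc != NULL);
    if (doc->focused == el)
        doc->focused = NULL;
    free(el);
}

/* Returns the focused element's id, or -1 if nothing is focused. */
int document_focused_id(const struct document *doc)
{
    assert(doc != NULL);
    return doc->focused != NULL ? doc->focused->id : -1;
}
```

This only helps for references the code knows about. Any other copy of the pointer held elsewhere is still dangling after the free, which is why large codebases find this class of bug so hard to eliminate.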

4. Off-By-One Errors: The Tiny Miscalculation with Big Impact

Off-by-one errors are subtle mistakes in calculation, often involving loop boundaries or array indexing. In C, a common manifestation is forgetting to account for the null-terminating character (\0) when allocating space for strings. For instance, if you need to store a 10-character string, you actually need 11 bytes (10 for characters + 1 for \0).

These seemingly minor errors can lead to buffer overflows (writing one byte past the allocated end) or other out-of-bounds accesses, causing unpredictable behavior or opening doors for exploitation.
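The terminator arithmetic can be made explicit in code. In this sketch, string_storage_size and copy_string are hypothetical helper names; the point is only that the allocation size is strlen(s) + 1, never strlen(s).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bytes needed to store s, including the terminating '\0'. */
size_t string_storage_size(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Hypothetical helper: returns a heap copy of s, or NULL on
   allocation failure. The caller is responsible for free(). */
char *copy_string(const char *s)
{
    size_t size = string_storage_size(s);
    char *copy = malloc(size);     /* not malloc(strlen(s)) */
    if (copy != NULL)
        memcpy(copy, s, size);     /* copies the '\0' as well */
    return copy;
}
```

For the 10-character string "0123456789", string_storage_size returns 11, matching the count given above.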

5. Double Free: Freeing What's Already Gone

Calling free() twice on the same block of memory is a "double free." This is undefined behavior and can seriously corrupt the internal bookkeeping structures that the memory allocator behind malloc and free relies on.

The implications are dire:

  • Program Crash: The program might crash immediately due to memory corruption.
  • Heap Corruption: The memory manager's internal state can become inconsistent, leading to unpredictable behavior later.
  • Arbitrary Code Execution: A sophisticated attacker can often manipulate the heap structures through a double free to achieve arbitrary read/write primitives, ultimately leading to remote code execution. When your code enters undefined behavior territory, "all bets are off."
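A widely used mitigation is to reset a pointer to NULL immediately after freeing it, relying on the fact that free(NULL) is defined to do nothing. The macro FREE_AND_NULL and the session struct below are hypothetical names used only to sketch the idea.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical convenience macro: frees p and resets it to NULL.
   free(NULL) is a no-op, so running this twice on the same
   variable is harmless. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Hypothetical object whose cleanup path may run more than once,
   for example from both an error handler and a normal shutdown. */
struct session {
    char *buffer;
};

void session_cleanup(struct session *s)
{
    assert(s != NULL);
    FREE_AND_NULL(s->buffer);
}
```

Calling session_cleanup twice on the same session is safe here because the second call frees a NULL pointer. The pattern protects only the variable that gets reset; a second pointer aliasing the same block would still trigger a double free.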

Conclusion: The Unpredictable Nature of Undefined Behavior

The common thread running through these memory management errors is "undefined behavior." When your C code exhibits undefined behavior, the compiler and runtime environment are free to do anything. Your program might appear to work, it might crash, or, most terrifyingly, it might create a subtle vulnerability that an attacker can meticulously exploit to gain control of your system.

C's power is undeniable, but it comes with a non-negotiable demand for meticulousness in memory management. The historical incidents highlighted here serve as stark reminders that even a single oversight in handling malloc and free can have devastating, real-world consequences. Secure C programming isn't just about writing correct code; it's about anticipating and preventing every possible way memory can be mismanaged.