
Commit 05c4e33

Merge branch 'main' into lfx-anushka
2 parents cde1de9 + 46e891e commit 05c4e33

4 files changed

Lines changed: 169 additions & 0 deletions


Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
---
title: "Chaos testing the CloudNativePG project"
date: 2026-01-05
draft: false
image:
  url: yash.jpeg
  attribution:
authors:
  - fdrees
tags:
  - lfx
  - mentorship
  - kubernetes
  - postgresql
  - litmus
  - devops
summary: "Meet the mentee: Yash Agarwal worked with the project maintainers on adding
  chaos testing to CloudNativePG, as part of the LFX mentorship program."
---

In the summer we wrote about how CloudNativePG was back for the
September-October-November LFX term with [several projects for mentoring](https://cloudnative-pg.io/blog/2025-term3-lfx-cncf-mentorship/).
One of them focused on chaos testing.

Yash Agarwal worked with mentors and CloudNativePG maintainers Gabriele Bartolini,
Marco Nenciarini, Francesco Canovai, and Jonathan Gonzalez to enhance the
project's test coverage. Introducing LitmusChaos, a comprehensive chaos testing
framework, the team designed automated chaos experiments for common failure
scenarios, integrated them into CI/CD workflows, and collected observability
metrics like failover time and data consistency. I had a chat with Yash about
his work, and about how he got into tech in the first place.

## Start at the beginning

Yash's venture into programming started when he got introduced to Python in 11th
grade. He was always fascinated by technology, and got further inspired to pursue
a career as a programmer by his cousin Amit, a software developer.

Today Yash is a full stack developer intern at Seeqlo, where he, among other
things, focuses on streamlining cloud operations and optimizing performance.
Based in Bengaluru, India, Yash is a member of Point Blank, a student-run tech
community dedicated to learning together.

He looks back at working with the CloudNativePG team as a "great learning experience".
They met twice a week for 30 minutes to discuss the progress of the project.
One thing that Yash says he learned is to have more patience.

## Chaos testing

The new [chaos-testing repository](https://github.com/cloudnative-pg/chaos-testing) Yash worked on provides automated tools to validate
PostgreSQL cluster resilience under failure conditions. It combines two testing
approaches:

* Jepsen Consistency Testing - Uses the well-known Jepsen framework to check
  database consistency. It continuously runs database operations (50 ops/sec),
  records the results, and validates that no data is lost or corrupted during
  failures.
* LitmusChaos Fault Injection - Uses LitmusChaos to simulate real-world failures
  by repeatedly deleting the PostgreSQL primary pod (every 60-180 seconds),
  forcing CloudNativePG to perform automatic failover.

You can read more about the project in the repository's [README](https://github.com/cloudnative-pg/chaos-testing/blob/main/README.md). And, in case
you're curious, here's Yash's PR: https://github.com/cloudnative-pg/chaos-testing/pull/3

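To make the two approaches above a bit more concrete, here is a minimal Python
sketch. It is an illustration only, not code from the chaos-testing repository:
the function names `primary_kill_schedule` and `lost_writes` are hypothetical,
and the sketch only models *when* the primary would be deleted and *how* lost
writes could be detected, without talking to Kubernetes, LitmusChaos, or Jepsen.

```python
import random


def primary_kill_schedule(duration_s, rng, min_gap_s=60, max_gap_s=180):
    """Return the times (seconds from start) at which the primary pod would be deleted.

    Hypothetical helper: gaps between deletions are drawn uniformly from
    [min_gap_s, max_gap_s], mirroring the 60-180 second interval in the post.
    """
    times = []
    t = rng.uniform(min_gap_s, max_gap_s)
    while t <= duration_s:
        times.append(t)
        t += rng.uniform(min_gap_s, max_gap_s)
    return times


def lost_writes(acknowledged, final_read):
    """Return acknowledged values that are missing from the final read.

    Hypothetical helper: a write the database acknowledged must still be
    visible after all failovers; anything missing counts as data loss.
    """
    return sorted(set(acknowledged) - set(final_read))


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the schedule is reproducible
    schedule = primary_kill_schedule(600, rng)
    print("primary deletions at (s):", [round(t) for t in schedule])

    # Writes 1-4 were acknowledged, but 3 is missing from the final read.
    print("lost writes:", lost_writes([1, 2, 3, 4], [1, 2, 4]))
```

In a real run, the schedule would drive the fault injection while a separate
workload records which writes were acknowledged; comparing that record with the
final database state is the kind of check a consistency test performs.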
## Contributing to Litmus itself

At first, Yash couldn't work out how to get the chaos engine to target the primary
pods, since the appKind CloudNativePG uses isn't natively supported by Litmus. "I tried
many things, but when I tried AppKind as 'cluster' it worked! I read the Litmus
code and found that there were some validations which prevented 'Cluster' (capital
'C') from working. This behavior was not described in Litmus' documentation,
which meant I could submit a PR and prevent the next person from running into
the same issue!"

## What's next?

In the second half of his 3rd year, Yash is exploring opportunities in the field
of backend and DevOps. "I will surely try to contribute more towards CloudNativePG
when time permits!" You can follow Yash's work on [GitHub](https://github.com/XploY04).
146 KB
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
---
title: "Sticking with Open Source: pgEdge and CloudNativePG"
date: 2026-01-02
draft: false
image:
  url: pgedge_cloudnativepg.jpg
  attribution:
authors:
  - fdrees
tags:
  - helm
  - ImageCatalog
  - CNCF
  - kubernetes
  - postgresql
  - open-source
summary: "We talked to Matthew Mols, Sr. Director of Engineering at pgEdge,
  about how CloudNativePG enables them to meet the requirements of their customers
  using just open source."
---

[Matthew Mols](https://www.linkedin.com/in/mmols/) is the Sr. Director of Engineering at pgEdge, a team of engineers and
entrepreneurs on a mission to make it easy to build, deploy and manage
enterprise-grade applications at scale on Postgres.

Recently pgEdge announced their [CloudNativePG integration](https://www.postgresql.org/about/news/pgedge-announces-cloudnativepg-integration-simplifying-postgres-on-kubernetes-3166/) and that they were
joining the Cloud Native Computing Foundation (CNCF).

We had a chance to talk to Matt about their use of CloudNativePG.

## Why CloudNativePG works for pgEdge

Matt loves learning about customers' challenges with Postgres, and thinking about
how they can build or suggest tools and approaches to make their lives easier.
"I'm continually surprised by the different ways folks are leveraging Postgres
in their businesses."

Matt's role is focused on developing tools that enable pgEdge's customers to
effectively deploy Postgres, whether that’s on Kubernetes, VMs, bare metal, or in
the Cloud. A lot of this work is centered around making it easier to use tools
together to meet the requirements of different kinds of customers.

"We are fully committed to open source and tend to utilize a combination of open
source extensions and tools that we've developed and released, like Spock for
multi-master (active-active) logical replication, combined with stable community
tools like CloudNativePG."

pgEdge uses CloudNativePG in their [Helm chart](https://docs.pgedge.com/pgedge-containers/), which allows users to deploy
active-active databases into multiple Kubernetes clusters, and keep them in sync.

## Getting started with CloudNativePG

Before CloudNativePG, Matt and his team used other operators, and a mix of custom
Helm charts that leveraged Kubernetes primitives to deploy Postgres instances.
CloudNativePG's popularity and stability, and its acceptance into the CNCF,
confirmed that it was the right choice as their new default.

"Working with CloudNativePG has been really straightforward for us since we've
moved to it exclusively. In particular, the documentation is very well done,
with a combination of 'start from here' examples, combined with in-depth guides
for every feature. Deploying Postgres comes with a lot of choices on specific
configuration, and it does a great job of laying out why you would choose from
one option or the other, with sensible defaults."

Matt singles out access to stable Service endpoints that point to the current
primary instance as "one of the most valuable aspects of deploying a
CloudNativePG cluster". He adds: "Outside of Kubernetes there are many tools you need
to integrate correctly to give that guarantee to applications and integrations."
pgEdge leverages these stable services to enable distributed databases across
multiple CloudNativePG clusters in different Kubernetes clusters, while relying
on automatic failover with standby instances in each region.

In terms of roadmap, Matt is particularly excited about the recent introduction
of dynamic loading of PostgreSQL extensions, and some of the upcoming work to
extend that to the ImageCatalog CRD. "As Postgres has embraced containerization
more and more, this has been a challenging area to navigate, with growing image
sizes, dependency management headaches, and adherence to license requirements.
In particular, this is going to go a long way towards improving how we manage
supply chain risk in the Postgres community."

## What's next?

Matt looks forward to contributing back to the project in the future. "Our hope
is to look to contribute more capabilities that enable distributed deployment
with CloudNativePG, potentially as part of supporting the CNPG-I approach to
plugins." The goal is to make it easier to operate active-active databases that
span multiple Kubernetes clusters, enabling better support for different
types of multi-region deployments. "Our thought is that it's best done through
CloudNativePG interfaces."
18.6 KB
