Reimplementing the AWS EKS API with Clojure using BigConfig, Rama, and Pedestal

The world of cloud infrastructure often involves interacting with complex APIs. While services like AWS EKS provide robust management for Kubernetes clusters, there might be scenarios where you need a more tailored or localized control plane. This article will guide you through reimplementing the AWS EKS API using a powerful Clojure stack: Pedestal for the API, BigConfig to wrap Terraform and Ansible in a workflow, and Rama for state and jobs.

Before we dive into the how, let’s consider the why. K8s, Spark, ClickHouse, Postgres, and similar systems are all good candidates for in-house software as a service. Reimplementing a cloud API might seem counterintuitive, but it can be beneficial for:

  • Avoiding vendor lock-in: Owning the control plane can matter for some companies.
  • Multi-cloud strategy: You need an EKS-like solution on multiple cloud providers, behind one generic API.
  • SaaS: You maintain an open source project and the SaaS offering is your source of revenue.
  • Metal: You cannot use the cloud but still want to provide the same developer experience inside your company.
  • Integration costs: Buying EKS and integrating it with the rest of your infrastructure may be infeasible or very expensive; building an EKS-like solution can be cheaper.

Disclaimer: This is a simplified blueprint for educational and experimental purposes. It will not cover the full breadth and complexity of the actual AWS EKS API.

Here’s a quick overview of the tools we’ll be using:

  • BigConfig: A workflow and template engine that gives us a zero-cost build step before running any DevOps tool like Terraform or Ansible.
  • Rama: A distributed stream processing and analytics engine that can also function as a durable, highly concurrent data store. We’ll use Rama to manage our cluster definitions and state.
  • Pedestal: A comprehensive web framework for Clojure that emphasizes data-driven development and offers excellent support for both synchronous and asynchronous request handling. It will serve as our API gateway.

Let’s imagine the core entities we want to manage: EKS Clusters. For simplicity, we’ll focus on creating and describing clusters.
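
To ground the discussion, here is one way the desired state of a cluster could look as a plain Clojure map. The keys are an assumption, loosely modeled on the EKS CreateCluster request:

```clojure
;; Hypothetical desired state for one cluster, loosely modeled on the
;; EKS CreateCluster request. The keys are illustrative, not final.
(def example-cluster
  {:name       "analytics-prod"
   :version    "1.29"
   :region     "eu-central-1"
   :vpc-config {:subnet-ids ["subnet-aaa" "subnet-bbb"]}
   :status     :creating})
```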

  • Reuse GitOps: The code that builds a single K8s cluster with GitOps should be reusable. Replacing GitOps with an API should not require reimplementing everything from scratch; the solution should be a generalization of the single-cluster GitOps one.
  • Declarative when possible: Terraform should be used to create resources instead of the AWS APIs whenever it is possible.
[Diagram: architecture overview]
  • Pedestal API: to create and describe clusters (see the route sketch after this list).
  • Rama Module: to store the desired state, clone the repo, and invoke BigConfig with that state (a rough sketch follows further below).
  • BigConfig Module: this is where the heavy lifting happens:
    • Workflow: achieving the desired state requires multiple steps.
    • Lock: to be sure that changes are ACID.
    • Build: to generate the configuration files for Terraform based on the desired state.
    • Apply: to run terraform apply programmatically.
  • Modularity: Every deliverable can be developed in parallel by adopting contracts.
  • Uniformity: The BigConfig, Rama, and Pedestal deliverables are all written in Clojure.
  • Declarative: Creating an EC2 instance programmatically is faster with Terraform than with direct AWS SDK calls, and we don’t need to worry about lifecycle management.
  • Reusability: The GitOps code can be reused. This is a killer feature: the code to provision one resource with GitOps or multiple resources with an API doesn’t require switching from Terraform to the AWS SDK. The API is just a virtual admin.
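
Even before the real code lands, a minimal sketch helps pin down the contract. Here is what the Pedestal side could look like; the handler names and the JSON shape are assumptions, and the atom is just a stand-in so the sketch runs standalone:

```clojure
(ns eks-api.service
  (:require [io.pedestal.http :as http]
            [io.pedestal.http.body-params :as body-params]
            [io.pedestal.http.route :as route]))

;; Stand-in for the Rama module: the real handlers would hand the
;; desired state to Rama instead of an atom.
(defonce clusters (atom {}))

(defn create-cluster [request]
  (let [{:keys [name] :as spec} (:json-params request)]
    (swap! clusters assoc name spec)
    {:status 201 :body (str "Creating cluster " name)}))

(defn describe-cluster [request]
  (if-let [spec (get @clusters (get-in request [:path-params :name]))]
    {:status 200 :body (pr-str spec)}
    {:status 404 :body "Cluster not found"}))

(def routes
  (route/expand-routes
   #{["/clusters" :post [(body-params/body-params) create-cluster]
      :route-name :create-cluster]
     ["/clusters/:name" :get describe-cluster
      :route-name :describe-cluster]}))

(defn start []
  (http/start (http/create-server {::http/routes routes
                                   ::http/type   :jetty
                                   ::http/port   8080})))
```

Upgrading from this sketch is then a matter of swapping the atom for calls into the Rama module.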

I’m working on the code right now. Stay tuned, I will update the blog post as soon as I have the first version.
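
In the meantime, here is a rough sketch of the Rama side: a depot receives desired cluster states and a PState indexes them by name. This is written from memory of Rama’s Clojure API, so treat the schema and the dataflow details as assumptions rather than working code; the repo-cloning and BigConfig invocation steps are left out.

```clojure
(ns eks-api.module
  (:use [com.rpl.rama]
        [com.rpl.rama.path]))

;; Rough sketch: desired cluster states are appended to a depot,
;; partitioned by cluster name, and materialized into a PState.
(defmodule EksModule
  [setup topologies]
  (declare-depot setup *cluster-depot (hash-by :name))
  (let [s (stream-topology topologies "clusters")]
    (declare-pstate s $$clusters {String java.util.Map})
    (<<sources s
      (source> *cluster-depot :> *cluster)
      (get *cluster :name :> *name)
      ;; Store the full desired state under the cluster name.
      (local-transform> [(keypath *name) (termval *cluster)]
                        $$clusters))))
```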

This is a basic example, but you can extend it significantly:

  • More EKS Features: Implement more aspects of the EKS API, such as node groups, Fargate profiles, or update operations.
  • Authentication and Authorization: Integrate with a robust authentication system to secure your API.
  • Error Handling: Implement more sophisticated error handling and meaningful error messages, and adopt OpenTelemetry for observability.

By combining BigConfig, Rama, and Pedestal, we’ve laid the foundation for a custom EKS-like API in Clojure. This approach provides a high degree of control, flexibility, and the ability to tailor your infrastructure management precisely to your needs. This project serves as an excellent starting point for exploring the potential of building custom cloud-native services with Clojure.

Would you like to have a follow-up on this topic? What are your thoughts? I’d love to hear your experiences.

The killer feature of BigConfig

For anyone working with Infrastructure as Code (IaC), managing configurations and deployments efficiently is key. Engineers are constantly seeking ways to enhance their workflows. Today, we’re diving into a powerful combination: OpenTofu and BigConfig, highlighting a killer feature that makes your build step practically invisible!

IaC tools like OpenTofu (an open-source alternative to Terraform) empower teams to define, provision, and manage infrastructure through code. However, as projects scale, especially in complex environments, the build and deployment process can become a multi-step chore. This often involves:

  • Git checks: Ensuring your working directory is clean and up-to-date.
  • Lock acquisition: Making sure that changes are applied in order and incrementally.
  • Execution: Iterating on the infracoding until it works.
  • Git pushes: Committing changes back to your repository if the change is successful.
  • Environment-specific deployments: Handling different configurations for environments like staging and production.

This manual orchestration can be time-consuming and prone to errors.

Enter BigConfig: Simplifying Complex Workflows

BigConfig is a fantastic tool designed to encapsulate and automate these complex command sequences. It allows you to define a series of steps and execute them with a single command. Think of it as a smart wrapper for your common IaC operations. By centralizing these tasks, BigConfig significantly reduces cognitive load and improves consistency.

The Killer Feature: An Invisible Build Step with a Shell Alias

Here’s where the magic truly happens! By combining OpenTofu, BigConfig, and a simple shell alias, we can create an invisible build step. Imagine replacing a series of manual operations with just one, familiar invocation.

Consider this powerful shell alias:

alias tofu="bb build git-check lock exec git-push unlock-any -- alpha prod tofu"

Let’s break down what this alias does:

  1. alias tofu="...": This redefines your tofu command for the session. Now, whenever you type tofu, it executes BigConfig instead of tofu; for example, tofu apply expands to bb build git-check lock exec git-push unlock-any -- alpha prod tofu apply. Every step of the workflow is executed only if the previous step succeeded (see the sketch after this list).
  2. build: This is the BigConfig step that initiates a build process and achieves DRY like Atmos. If it fails, there is no reason to proceed; that’s why this step is always present and always first.
  3. git-check: build should not make the Git working directory dirty, and git-check ensures your working directory is clean and up to date.
  4. lock: It then acquires a lock for the module alpha and the profile prod, preventing concurrent changes from other developers.
  5. exec: This is the core execution step, where BigConfig runs your OpenTofu commands. If exec fails, the workflow stops and the remaining steps are not executed. In particular, the lock is not released and the changes are not pushed, so that the developer can fix or revert the change.
  6. git-push: This automatically pushes the just-applied change to your Git repository. You should always be one commit ahead of origin when you make changes.
  7. unlock-any: This ensures that any locks are released; any means that the owner is ignored. This step can also be used alone if another developer forgets to release the lock.
  8. -- alpha prod tofu: -- is the separator between the workflow definition and the module, the profile, and the shell command, in this case tofu.
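
The fail-fast behavior is the essence of the workflow. Here is a toy sketch of the idea in Clojure, not BigConfig’s actual implementation: each step is a function of a context map, and the first failure short-circuits the rest.

```clojure
;; Toy illustration, not BigConfig's actual code: each step takes the
;; context map and returns it with :ok? set; the first failing step
;; stops the workflow via reduced.
(defn run-workflow [ctx steps]
  (reduce (fn [acc step]
            (let [result (step acc)]
              (if (:ok? result)
                result
                (reduced result))))
          (assoc ctx :ok? true)
          steps))

;; Hypothetical usage mirroring the alias:
;; (run-workflow {:module "alpha" :profile "prod" :cmd "tofu"}
;;               [build git-check lock exec git-push unlock-any])
```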

A simple alias now extends the capabilities of OpenTofu: it gains the capabilities of Atlantis and Atmos. And BigConfig is not specific to OpenTofu; it can also be used with Ansible, K8s, and your dotfiles.

Before: OpenTofu was not enough; other tools like Atlantis and Atmos were required.

After: Any DevOps tool can be augmented to have the capabilities of Atlantis and Atmos.

The sequential workflow, with all its checks, locks, and pushes, becomes completely invisible to the user. You interact with OpenTofu as you normally would, but all the surrounding boilerplate is handled automatically by BigConfig.

  • Increased Productivity: Engineers can focus on writing IaC, not on the deployment mechanics.
  • Reduced Errors: Automated checks and consistent execution minimize human error.
  • Standardized Deployments: Ensures that every deployment follows the same robust process.
  • Faster Onboarding: New team members can quickly get up to speed without memorizing complex sequences.

If you’re using OpenTofu and looking to streamline your IaC workflows, exploring BigConfig and implementing a similar shell alias is highly recommended. It’s a small change that yields massive benefits, transforming your change process from a visible chore into an invisible, seamless part of development. Happy infrastructure building! 🚀

Are you still using Atlantis or Atmos? What are your thoughts? I’d love to hear your experiences.

Why I have replaced Atlantis with BigConfig

As a long-time infrastructure enthusiast, I’ve had my share of dalliances with various tools and workflows. For a good while, Atlantis was my reliable partner in managing Terraform deployments. It brought order to the chaos of collaborative infrastructure-as-code, and for that, I’ll always be grateful.

However, like many relationships, sometimes you just grow apart. And in the rapidly evolving world of DevOps, staying stagnant means falling behind. So, after much deliberation, I’ve decided to move on from Atlantis for my Terraform needs, and I want to share why.

The Honeymoon Phase: What Atlantis Did Well

When Atlantis first arrived on the scene, it was a revelation. It solved a very real problem: how to bring a GitOps-like workflow to Terraform.

  • Pull Request Driven Workflow: This was Atlantis’s killer feature. The ability to run terraform plan and terraform apply directly from a pull request, with comments showing the output, was incredibly powerful. It made code reviews for infrastructure changes intuitive and collaborative.
  • Centralized State Management: By running within a controlled environment, Atlantis helped ensure that Terraform state was managed consistently and securely.
  • Concurrency Control: It prevented multiple users from running conflicting terraform apply commands on the same project, which was a lifesaver for team collaboration.
  • Simplicity of Setup (Initially): Getting Atlantis up and running wasn’t overly complex, especially for smaller teams.

For years, Atlantis was a solid choice, and it undoubtedly improved the way many teams managed their Terraform.

The Cracks Begin to Show: Where Atlantis Fell Short

As my software engineering skills matured, some of Atlantis’s anti-patterns started to become more apparent.

  • Pull Request Driven Workflow: Atlantis, by design, has an opinionated workflow built around the PR. But this is an anti-pattern if you follow David Farley’s principles of modern software engineering: Trunk-Based Development is incompatible with GitOps.
  • Hard to upgrade to an API or a PaaS: While Atlantis is great for getting started, it becomes an obstacle if you want to upgrade your platform to an API or a PaaS. The developer experience decreases because you end up with two solutions: one for K8s and one for AWS/GCP/Azure resources.
  • The approval becomes ineffective: The PR approach is a weak guard rail; most reviews do not catch bugs before they end up in production.

The Modern Alternative: Why I’m Moving On

So, how does BigConfig address these limitations of Atlantis?

  • Client-side workflow: BigConfig has a CLI that provides the workflow without any server-side component, while still coordinating changes made by multiple developers and making sure they are always applied sequentially.
  • It’s a library: The CLI is also a library that can be used from any Clojure web framework, like Pedestal, to upgrade your infracoding to an API or a PaaS (a sketch follows this list).
  • Better developer experience: Let’s be honest, terraform is fundamentally an interactive tool. The scenario where a developer writes a non-trivial change in one go and then runs it in Atlantis without bugs is rare. Most of the time we develop the terraform code incrementally, with a sequence of plan and apply, and the PR process increases the delivery time.
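
To give a taste of that upgrade path, here is a minimal sketch that simply shells out to the BigConfig CLI from Clojure, reusing the exact workflow from the alias in the previous post. The real library entry points would be cleaner, so treat the invocation details as an assumption.

```clojure
(ns infra-api.apply
  (:require [clojure.java.shell :as shell]))

;; Minimal sketch: trigger the same BigConfig workflow the shell alias
;; uses, but from Clojure, so a web handler (e.g. in Pedestal) can
;; call it. Module, profile, and the trailing tofu command are the
;; same positional arguments as in the alias.
(defn apply-desired-state!
  [module profile]
  (shell/sh "bb" "build" "git-check" "lock" "exec" "git-push" "unlock-any"
            "--" module profile "tofu" "apply" "-auto-approve"))

;; (apply-desired-state! "alpha" "prod")
;; => {:exit 0, :out "...", :err "..."}
```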

The Verdict: Atlantis is not compatible with modern software development

Let me be clear: Atlantis isn’t a “bad” tool. For many teams, especially those that are fine with the limits of the GitOps approach, it can still be a fantastic entry point. But for me, modern software engineering principles should be applied to both development and operations.

In Berlin, where innovation moves at a rapid pace, staying agile means constantly re-evaluating your toolchain. My decision to move on from Atlantis is a reflection of that. I’m excited about the possibilities that BigConfig offers, allowing me to focus more on building reusable infrastructure.

Are you still using Atlantis? What are your thoughts? I’d love to hear your experiences.

Forward Deployed Engineering ❤️ Open Source

Forward Deployed Engineering and Open Source as an alternative to SaaS and Professional Services.

In the evolving landscape of software, SaaS (Software as a Service) has long been the dominant model, offering convenience and accessibility. However, I expect a new paradigm to emerge: Forward Deployed Engineering for Open Source projects. This approach could challenge SaaS in the future by offering a compelling alternative that prioritizes control, customization, and community.

What is Forward Deployed Engineering applied to Open Source?

At its core, Forward Deployed Engineering applied to Open Source means that the engineers developing an Open Source project are actively involved in its deployment and operation within a user’s specific environment. Unlike SaaS, where the vendor manages everything, this model empowers users to run and even modify the software on their own infrastructure, with direct support and collaboration from the project’s core team.

Think of it as having the creators of the software as part of your extended team, helping you integrate, optimize, and troubleshoot it directly within your unique setup.

The SaaS Conundrum: When Convenience Comes at a Cost

SaaS offers undeniable advantages: quick setup, automatic updates, and reduced operational overhead. But these benefits often come with significant trade-offs:

  • Vendor Lock-in: Migrating away from a SaaS provider can be notoriously difficult and costly, leading to reliance on a single vendor.
  • Limited Customization: SaaS solutions are designed for a broad audience, meaning deep customization to fit specific, niche requirements is often impossible or prohibitively expensive.
  • Data Control and Security Concerns: Users surrender control over their data to the SaaS provider, raising concerns about privacy, compliance, and security.
  • Opaque Costs: While seemingly straightforward, SaaS costs can escalate with usage, features, or user count, leading to unpredictable budgeting.
  • Lack of Transparency: The inner workings of a proprietary SaaS solution are a black box, making it difficult to diagnose issues or understand performance bottlenecks.
  • Merchant of complexity: Eventually, every SaaS becomes a merchant of complexity.

How Forward Deployed Engineering Offers a Superior Alternative

For many organizations, particularly those with complex needs, strict security requirements, or a desire for ultimate control, Forward Deployed Engineering for Open Source projects offers a powerful alternative to SaaS.

  1. Ultimate Control and Ownership: With Forward Deployed Engineering, you own the solution and the data. It runs on your infrastructure, giving you complete control over security, compliance, and data residency. There’s no vendor lock-in; you can switch providers or even manage it entirely in-house if you choose.

  2. Deep Customization and Flexibility: Open Source projects are inherently modifiable. Forward Deployed Engineering amplifies this by bringing the project’s experts directly to your environment. This allows for unparalleled customization, integration with existing systems, and the ability to tailor the software precisely to your unique workflows and requirements.

  3. Enhanced Security and Transparency: Running software on your own infrastructure with the help of the project’s engineers allows for greater control over security protocols. Furthermore, the Open Source nature means the code is auditable, providing transparency and reducing the risk of hidden vulnerabilities.

  4. Cost-Effectiveness at Scale: While initial setup might require more effort than a SaaS solution, the long-term costs can be significantly lower, especially at scale. You avoid recurring subscription fees that increase with usage, and you have the flexibility to optimize your infrastructure costs.

  5. Direct Collaboration and Community Benefits: Forward Deployed Engineering fosters a strong collaborative relationship between the user and the Open Source project team. This often leads to direct feature requests being implemented, bugs being squashed faster, and a stronger sense of community ownership over the software’s direction. Your operational challenges directly inform the project’s development.

When is Forward Deployed Engineering the Right Choice?

Forward Deployed Engineering isn’t for everyone. Organizations that benefit most typically include:

  • Enterprises with complex IT landscapes: Those needing deep integration with existing systems.
  • Companies with strict regulatory or security requirements: Industries like finance, healthcare, or government.
  • Teams seeking maximum control and customization: When off-the-shelf solutions don’t quite fit.
  • Organizations with in-house technical talent: While the project engineers assist, some internal expertise is beneficial.

While SaaS will undoubtedly remain a popular choice for many, Forward Deployed Engineering for Open Source projects represents a powerful shift towards user empowerment and open collaboration. It’s not about outright replacing all SaaS, but rather providing a robust and often superior alternative for those who demand more control, transparency, and customization.

As Open Source projects continue to mature and offer enterprise-grade solutions, the model of Forward Deployed Engineering will likely become a cornerstone of how organizations adopt and leverage powerful software, shaping a future where the lines between vendor and user are increasingly blurred in favor of a collaborative, open ecosystem.

Would you like to have a follow-up on this topic? What are your thoughts? I’d love to hear your experiences.

Merchant of complexity

When Your Vendor Profits from Your Inefficiency

Have you ever felt like a vendor is subtly, or not so subtly, making things more complicated than they need to be? You’re not alone. This phenomenon, which I like to call the “Merchant of Complexity,” describes a business model where a vendor’s profitability is directly tied to the inefficiency of your internal processes. It’s a cunning, often insidious, way for them to extract more money from you over time.

The core strategy of a Merchant of Complexity is to introduce or maintain layers of intricacy that require their continued, and often expensive, intervention. Here’s how they typically operate:

  • Proprietary Systems and Black Boxes: They might offer solutions that are deliberately opaque or built on proprietary technology, making it difficult for your internal team to understand, manage, or modify. This creates a dependency on their experts for even minor adjustments or troubleshooting.
  • Perpetual Consulting and Support: Instead of providing a truly intuitive and self-sufficient product, they create a constant need for their consulting services, training, and ongoing support. Each new feature, update, or integration becomes an opportunity for them to bill for their time.
  • Fragmented Solutions: Rather than offering a comprehensive, integrated solution, they provide a series of disconnected modules or products. This forces you to spend more time and resources integrating these disparate parts, often with their assistance.
  • High Switching Costs: They might make it incredibly difficult to migrate away from their services. This could involve convoluted data export processes, non-standard data formats, or contract clauses that penalize early termination. Once you’re in, you’re stuck.

The immediate financial impact of dealing with a Merchant of Complexity is obvious: higher invoices for services, support, and customizations. However, the true costs run much deeper:

  • Decreased Productivity: Your team spends valuable time navigating unnecessarily complex systems, waiting for vendor support, or trying to piece together fragmented information.
  • Stifled Innovation: The difficulty in adapting or integrating new technologies can slow down your ability to innovate and respond to market changes.
  • Increased Frustration and Morale Issues: Employees become demoralized when faced with constant roadblocks and inefficiencies caused by vendor-imposed complexity.
  • Loss of Control: You cede control over your own processes and data, becoming reliant on an external entity whose interests may not align with yours.

Identifying and Combating the Merchant of Complexity

So, how can you spot a Merchant of Complexity and protect your business?

  1. Question Complexity: If a solution seems overly complicated for the problem it’s solving, challenge it. Ask for simpler alternatives or explanations.
  2. Demand Transparency: Insist on clear documentation, open APIs, and transparent pricing structures. Avoid “black box” solutions where you don’t understand how things work.
  3. Prioritize Self-Sufficiency: Look for vendors who empower your team to be self-sufficient through good design, comprehensive training, and accessible resources.
  4. Evaluate Total Cost of Ownership (TCO): Don’t just look at the initial price tag. Consider the ongoing costs of support, customization, training, and potential inefficiencies.
  5. Seek Integrated Solutions: Whenever possible, opt for vendors who offer holistic, integrated solutions that minimize the need for complex integrations.
  6. Read Contracts Carefully: Pay close attention to clauses related to data ownership, migration, and termination. Understand the switching costs before you commit.

In today’s fast-paced business environment, efficiency is paramount. Don’t let a Merchant of Complexity hold your business hostage to unnecessary intricacy. By being vigilant and asking the right questions, you can avoid these pitfalls and choose partners who truly contribute to your success, rather than profiting from your struggles.

Are you facing similar problems? What are your thoughts? I’d love to hear your experiences.