• David Guillot@discuss.tchncs.de
    15 hours ago

    Hi @danielquinn@lemmy.ca , author here.

    Thanks for taking the time for such an in-depth comment 👍 . I’m gonna try to answer point by point.

    On different configurations and settings

    Although I strongly agree with your take (from your recent blog post) that if ENVIRONMENT == "prod": should never happen, I’m really surprised about this:

    Your configuration & settings should be identical between environments

    I think there’s a misunderstanding here. Here’s how I see it:

    • A “setting” is a constant your application needs at runtime to make a decision about its behavior
    • A “configuration” is a way of running your application, with some “settings” preset and some “settings” required from the “environment”
    • An “environment” is a place where your application runs under a certain “configuration” that you’ve chosen, and this “environment” provides the values for the “settings” your chosen “configuration” requires

    This way of thinking enables you to have multiple environments running the same configuration with different values for some settings.
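
    The three definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the template's actual code: `load_settings`, `MissingSetting`, `DEPLOYED`, and every setting name and value are hypothetical.

```python
import os


class MissingSetting(Exception):
    """Raised when the environment does not provide a required setting."""


def load_settings(preset, required, environ=None):
    """Build the final settings for one configuration (hypothetical helper).

    preset   -- settings whose values the configuration fixes itself
    required -- names of settings the environment must provide
    environ  -- mapping of environment variables (defaults to os.environ)
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in required if name not in environ]
    if missing:
        raise MissingSetting(", ".join(sorted(missing)))
    settings = dict(preset)
    settings.update({name: environ[name] for name in required})
    return settings


# One "configuration": some settings preset, some required from the environment.
DEPLOYED = {
    "preset": {"DEBUG": False},
    "required": ["DATABASE_URL", "SECRET_KEY"],
}

# Two "environments" (made-up values) running that same configuration.
staging_env = {"DATABASE_URL": "postgres://staging-db/app", "SECRET_KEY": "s1"}
production_env = {"DATABASE_URL": "postgres://prod-db/app", "SECRET_KEY": "p1"}

staging = load_settings(DEPLOYED["preset"], DEPLOYED["required"], staging_env)
production = load_settings(DEPLOYED["preset"], DEPLOYED["required"], production_env)
```

    Both environments end up with the same set of settings and the same preset values; only the values supplied by the environment differ, and nothing in the code branches on the environment's name.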

    As for:

    how exactly do you expect to find & fix problems that only occur in production?

    From my experience, 99% of the time, problems that only occur in production are all about data and edge cases, not about configuration. And I’ve learned to avoid making the whole structure of my application constrained by 1% of the cases.

    On secrets-in-repo

    To be honest I have no experience with secrets-in-repo, and I am seeking feedback, but:

    secrets in the repo (…) requires a code change (+CI run, +deploy) to change any of the values.

    How is changing an .env file considered a code change? Of course it triggers a deploy, but it should not trigger a CI run, as the application's behavior did not change, only some envvar in some environment. Changing an .env file is exactly like changing an envvar through your cloud provider's console, with the only difference being that the envvars are yours, which is pretty important to me.

    instead of just one environment to configure, you have potentially dozens: production, staging, testing, and one for every developer who’s ever worked there

    This has nothing to do with storing secrets in the repo, but everything to do with how many environments you have. I think many teams like to have at least one staging environment, don’t you think?

    how they might have stored the keys in plain text on their machine

    I don’t see any difference with the infamous ~/.aws/credentials file that we all have on our work computer, that allows an attacker to fetch all secrets from all environments with the right AWS API calls. Yes, a developer who gets their laptop stolen and who did not encrypt the disk represents a huge risk to their company.

    When Steve gets fired, how many files do you have to decrypt, edit, and re-encrypt to rotate the secrets?

    Just the ones that are non-development, typically production and staging. Steve did not have the keys to other files; this is the whole point of the strategy I chose of having 2 keys (one individual, one shared). To be honest, even for not-so-small teams, this would not be a long task. Also, when I join a team or when I build a team, I don’t expect people to get fired too often 😅

    All that might be acceptable if the benefits were high, but they aren’t.

    I guess that’s subjective then. Because IMO, being independent from cloud providers when it comes to handling my application’s secrets is a great benefit. Maybe I’m wrong about the balance between the independence benefits and the security risks, but nothing in your comment convinces me that storing secrets in the Git repo is less safe than storing them in a vault that I access with a key that I have on my laptop.

    On licensing

    I’m sorry, but unless you point me to legal sources, I think you’re wrong about what the GPLv3 means; I studied this topic carefully before choosing that license because it’s my first open source piece of code.

    No, using a GPLv3 web project template as the base of a new web project doesn’t make the larger project a “derivative work” that must also be published under GPLv3, for two reasons:

    • What makes the GPLv3 license “viral” is when a program “links” to a GPLv3-licensed library; using a project template does not “link” (in a software sense) the newly created project to the template (i.e. there will be no from mytemplate import something in your Django project, thus your project won’t be considered a “derivative work”)
    • It’s the “distribution” of the covered software (modified or unmodified) that triggers the obligation to publish the source under GPLv3; remote execution (e.g. a web application) is not “distribution”, which is why AGPL exists. Examples of Django reusable apps published under GPLv3 exist (e.g. jazzband/django-invitations; imagine if everybody using this had to release their own website under GPLv3?)

    In contrast, what would be considered a “derivative work” is someone enriching my Django project template in order to sell a standardized Django experience as part of their consultancy. They would have to release the changes (fixes and new features) they make to my template under GPLv3.

    I like your suggestion to look at Creative Commons licenses though, maybe it would make sense, thanks.

    • Daniel Quinn@lemmy.ca
      14 hours ago

      So let’s get the licensing bit out of the way first. I am 100% confident that you’re wrong on this. The GPL is a copyright license and like all copyright licenses, it applies to the work and your rights to copy it. If you choose to copy the contents of a GPL project’s code into your own project, the license dictates that you must license your project under the GPL. For example, if you were to build a new kernel for your own special operating system and copy out significant portions of the Linux kernel to do it, your new kernel will be covered by the GPL.

      You may be confusing the GPL with the LGPL here, which specifically has an exemption for linking. Under that license, you can link to an LGPL project (it’s not clear if a Python import would qualify, as this was originally written for external modules in C projects) without your project being covered by the license.

      You’re also misunderstanding “distribution” here. While it’s true that there’s a distinction between the GPL and AGPL in how this word is defined, it does not affect how the license applies. To use another example, the fuzzywuzzy project is GPL-licensed, so if you were to use it in your Django project, it would necessarily make your Django project GPL. However, as “distribution” under the GPL applies only to sharing copies of the project with others and not to services provided over the web, your project will be GPL, but you’ll be under no obligation to share the source with anyone unless you were to copy the project onto someone’s laptop. So long as your project is just a webserver sending HTML to the user, you’re under no obligation to share the source code for your server.

      The AGPL on the other hand includes accessing the software over a network under its definition of “distribution” and so if fuzzywuzzy were AGPL licensed, this would require you to publish your Django project’s source publicly.

      Source: I too have been reading heavily on this front for about 23 years, so much so that I married a copyright lawyer. We talk about this stuff a lot.

      Regarding the secrets in-repo, I’m not going to fight you on this. In my experience it’s a Great Big Pain In The Ass to manage these things and I think you may be overlooking just how many of the devs on your team may need the rights to read/write production values.

      As for making the distinction between settings and configuration, again I think you’re going to live to regret this, as has every company I’ve started at that employs this pattern. You simply can’t have your development, testing, and production environments running different middleware classes (as your example suggests) and not be due for a surprise in production. No, your settings should be as close to production in all environments as possible, and breaking your settings up like this is just begging for deviation.

      As for the claim that 99% of problems in production are data-related, that too is not my experience with such systems. If you’re talking to S3 in production and local folders in development, or SQS in production and synchronous execution in development, you will have problems, and you won’t be able to detect them, let alone debug and fix them, in an environment that doesn’t match the place you’re deploying to.