In recent years, I have learned how to build sufficiently robust systems fast.
Here are some things I have learned:
* Learn one tool well. It is often better to use a tool that you know really well than something that on the surface seems more appropriate for the problem. For an extremely large number of real-life problems, Django hits the sweet spot.
Several times I have started a project thinking that maybe Django is too heavy, but soon the project outgrew the initial idea. For example, I just created a status page app. It started as a single-file Django app, but I luckily realized soon that it makes no sense to work around Django's limitations.
* In most applications that fit the Django model, the data model is at the center of everything. Even if you are making a rough prototype, never postpone data model refactoring. It just becomes more and more expensive and difficult to change over time.
* Most applications don't need to be single-page apps nor require heavy frontend frameworks. Even for those that can benefit from it, traditional Django views are just fine for 80% of the pages. For the rest, consider Alpine.js/HTMX.
* Most of the time, it is easier to build the stuff yourself. Need to store and edit customers? With Django, you can develop a simple CRM app inside your app in just a few hours. Integrating a commercial CRM takes much more time. This applies to everything: status page, CRM, support system, sales processes, etc., as well as most Django apps/libraries.
* Always choose extremely boring technology. Just use Python/Django/Postgres for everything. Forget Kubernetes, Redis, RabbitMQ, Celery, etc. Alpine/HTMX is an exception, because it lets you avoid much of the JavaScript stack.
While I agree with you, these two are the boring tech of 2025 for me. They work extremely reliably, they have well-defined use cases where they work perfectly, we know very well where they shouldn't be used, we know their gotchas, and the interest around them seems to be slowly waning. Personally, I'm a huge fan of both, just because they're very stable and they do what they're supposed to do.
A lot of the conversation about "simplicity" with kubernetes ignores that there are MANY different distributions of k8s.
I would argue that k3s on a VM is not much harder to set up than a normal reverse proxy setup. And the configs are a bit easier to find on a whim when coming back after months.
Obviously a fully self-hosted setup of k8s is a pretty large task, but in other forms it's a competitive option in the "simple" area.
Last time I tried what stumped me was getting "ReadWriteMany" volumes working.
My use case was pretty straightforward: a series of processing steps, with the caveat that storing the actual results in a DB would be bad due to the blobs' size. So instead, the DB is used to signal downstream consumers that files are ready, while the files themselves are written to the ReadWriteMany volumes so that downstream consumers can simply read them.
I tried Longhorn, but even though I managed to get a volume showing up in the UI, it was never detected as healthy, and after a while I refactored the workflow to use Apache Beam so that I could drop the databases and volumes and run everything from within a single beefy VM.
Yeah, this kind of use case is for sure straying into "this is more of a pain than it's worth" territory with regards to k8s. Especially when you can just do basic NFS mounts in *nix anyway.
If you are already familiar with k8s volumes and CSIs it's not a huge problem, but if you aren't, it's not worth learning if your goal is simplicity. At least in my opinion.
Thanks. In the end the Apache Beam code didn't even need a bucket; the HDD attached to the VM was more than enough to store everything, with moving things out of it as the final step.
I only used it once forever ago, and all I remember is being very confused about pod vs service and using a lot of magic strings to get load-balancing to work.
I mean, a basic k3s install plus a barebones deployment combined with a NodePort service is about as trivial as you can get. It's not THAT many more lines of config (60-70 max) than an nginx conf plus a "git clone + other stuff" deploy script.
I dunno, a sane nginx config is like 30 lines max for a basic service. You can manage it entirely with git (`git config receive.denyCurrentBranch updateInstead`) and some post-receive / push-to-checkout hooks.
Keep in mind, for K8s to work you also need to containerize your application, even if it's just static files (or I guess you can break containment and do volume mounts, or some sort of static file resource if you're serving static content). For non-static applications, you still need to configure and run it, which is just harder with k8s/docker in the mix.
Agree. Maybe it should be more like "forget learning Kubernetes, Redis or anything you don't already know", since you don't want to bifurcate your time between the learning curve and the product.
I also like Django a lot. I can get a working project up and running trivially fast.
In my day job I work with Go and while it's fine, I end up writing 10x more code for simple API endpoints, and as soon as you add query parameters for filtering, pagination, etc. it gets even longer. Adding a permissions model on top is similar. Of course there's a big performance difference, but largely the DB queries dominate performance, even in Python, at least for most of the things I do.
I haven't tried Dropwizard. Is it as batteries-included as Spring Boot? E.g.: how is authentication support? Do we need a lot of boilerplate to implement, say, an OAuth2 resource server?
There's an almost pathological resistance to using anything that might be described as a 'framework' in the Go community in the name of 'simplicity'.
I find such a blanket opinion to be unhelpful, what's fine for writing microservices is less good for bootstrapping a whole SaaS app and I think that people get in a bit too much of an ideological tizz about it all.
In my experience frameworks (in contrast to "libraries") add complexity making your application harder to understand and debug once you get past the initial prototype.
Also, by their very nature they almost require you to write a program of a minimum size even when a simpler program would do to solve your problem.
> Also, by their very nature they almost require you to write a program of a minimum size even when a simpler program would do to solve your problem.
I'm not sure I agree. Sure, they require you to pull in a large dependency, and that might not be good for some use cases (particularly microservices) where you're sensitive to the final size of the image. But with Django, for example, you get a generated project with settings.py, which configures your database, middleware, etc., and urls.py, which configures your routes, and then you're really free to structure your models and endpoints however you like.
It's always going to be more work with composable libraries since they don't 'flow'.
Just picking one of the examples I gave, pagination - that requires (a) query param handling, (b) passing the info down into your database query, and (c) returning the pagination info in the response. In Django (DRF), that's all built in; you can even set the default pagination for every endpoint with a single line in your settings.py and write no more code (sketched below).
In Go your equivalent would be wrangling something (either manually or using something like ShouldBindQuery in Gin) to decode the specific pagination query params and then wrangling that into your database calling code, and then wrangling the results + the pagination results info back.
Composable components therefore always leave you with more boilerplate.
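For concreteness, this is roughly what that single settings.py line looks like in DRF; a minimal sketch, with illustrative values:

    # settings.py: enable default pagination for every DRF endpoint
    REST_FRAMEWORK = {
        "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
        "PAGE_SIZE": 50,
    }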
I guess there are pros and cons.
Pros of composable libraries is that you can more easily build upon them for your specific use case.
It also doesn't tie you to ORM usage.
You have to be responsible for "your" specific flow... meaning you can build your own defaults easily wrt parsing query parameters and whatnot (building a generic paginated query builder API?). Nothing insurmountable.
Fortunately the framework is pretty configurable on all this sort of stuff! Just to hit pagination again since it's the example I've used in other comments, in DRF you implement a custom pagination class:
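(A minimal sketch of such a class - the class name and numbers are illustrative, since the original snippet isn't preserved here:)

    from rest_framework.pagination import PageNumberPagination

    class StandardPagination(PageNumberPagination):
        page_size = 25                       # default page size
        page_size_query_param = "page_size"  # let clients override per request
        max_page_size = 200                  # hard upper bound

    # settings.py would then point at it, e.g.:
    # REST_FRAMEWORK = {"DEFAULT_PAGINATION_CLASS": "myapp.pagination.StandardPagination"}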
That's the same as for most other configurable things. And ultimately if you want to override the behaviour for some specific endpoint then you can still easily do that by just implementing your own method for it.
After working in Go now for several years, what I've found is that generally people just don't do some things in their first pass because it's too much work and makes the PRs too large, and then they retrofit them afterwards. Which meant that the APIs were less consistent than the ones I worked on in Django.
> No one in their right mind would say: just use the standard library but I've seen it online.
Yes but...sometimes the proper level of abstraction is simply a good HTTP library. Where the API of your application is defined in terms of URLs, HTTP methods, status codes, etc., the http library from the Go standard library might be all you need (depending on other requirements, of course).
A real life example: I needed a simple proxy with well defined behavior. Writing it in Go as a web application that forwarded calls to another service based on simple logic only took a few pages of code in a single file.
I don't dispute that.
But in general you need sessions, and you need a bit of middleware to, for example, handle CSRF or gzip compression of the response, auth, etc.
Telling people to just use the standard library doesn't help.
Of course it is available. But often there is a need for more.
Look, I'm not for a moment suggesting that a proxy application would be a good fit for Django (or even Python). I'm talking about 'building a full stack web application in a reasonable amount of time, with auth, with a database, etc. etc.'
> No one in their right mind would say: just use the standard library but I've seen it online. That discourse is not helping.
I would say that.
The most important thing about the standard library is its stability. You won't ever need to touch code that works with the standard library. It's finished code. Other than bug fixes, of course.
Third-party libraries are a very different thing.
They get abandoned all the time, and then you're left with a burden. You either need to migrate to another library, maintain the abandoned library yourself, or live with a huge chunk of code that might be vulnerable.
They also get changed often enough, as their developers are probably not as careful about backwards compatibility as the core language developers.
A third-party library is a liability. Very rarely is its source code an ideal fit for your application. Often you'll use 10% of the library, and the rest is dead weight at best, a vulnerability source at worst. Remember Log4Shell? Instead of using the standard logging code in Java, some developer decided to pull in the log4j library, which is very nice to use and has lots of features. It can even download and execute code behind your back; very featureful.
Of course I'm not advocating to rewrite the world. This is insane. Some problems are just too big to solve by yourself. I also should note, that different ecosystems have different approaches to the language library and overall library culture. JS is terrible, while Go is not that bad, but it's not ideal either.
But my absolute, 100%-always approach is to use the standard library first and foremost. I won't use a third-party library just to save a few lines of code or make the code more "beautiful". I prefer dependency-free, boring, repetitive code any day.
And if I'm using a third-party library, I'm very picky about its stability and transitive dependencies.
It also depends on the kind of company. My experience has always been: you write some service, you throw it at production, and it works for the next 20 years. So you want this code to be as self-contained as possible, to reduce "chore" time spent on dependency management. The perfect application is dependency-free software that can be upgraded by changing the "FROM" line in the Dockerfile. It is stable enough that you can trust CI to do that.
I don't think that everyone is capable of, or should be, implementing CSRF protection or CORS handling.
While the standard library is an awesome starting point, telling people that it is sufficient is not going to convince them.
I mean, I'm keen on a small number of dependencies too, but the smaller the scope of a package you go, I find the more likely they are to be abandoned. The Go JWT library has been forked twice because it became abandoned by the original authors, just to give an example.
What's so hard about writing an SQL query and some JSON struct tags? I really don't like LLMs, but all the pain points of Go you described above are relatively moot now if you use them for these little things.
> Always choose extremely boring technology. Just use python/Django/Postgres for everything.
Hell, think twice before you consider Postgres. SQLite scales further than most people would expect it to, especially for local development / spinning up isolated CI instances. And for small apps it tends to be good enough for production too.
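If you do go this route for dev/CI, the Django side is a small settings toggle; a minimal sketch, assuming a hypothetical USE_SQLITE env var:

    # settings.py: SQLite for local dev/CI, Postgres otherwise
    import os

    if os.environ.get("USE_SQLITE") == "1":
        DATABASES = {"default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "db.sqlite3",
        }}
    else:
        DATABASES = {"default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": os.environ.get("POSTGRES_DB", "app"),
            "HOST": os.environ.get("POSTGRES_HOST", "localhost"),
        }}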
Sqlite is mostly boring but I've found that there's just slightly more risk of something going wrong because of the way it handles locking between threads. It has tended to misbehave under unexpected load and been difficult to fix in a way that Postgres hasn't.
I'm particularly thinking of workers running tasks, here. It's possible to lock up everything with write transactions or cause a spate of unhandled SQLITE_BUSY errors in cases where Postgres would just keep chugging along.
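For what it's worth, the usual mitigations are WAL mode plus a busy timeout, which reduce (but don't eliminate) SQLITE_BUSY under concurrent writers; a minimal sketch with the stdlib sqlite3 module, file name being a placeholder:

    import sqlite3

    conn = sqlite3.connect("app.db", timeout=5.0)  # wait up to 5s for locks
    conn.execute("PRAGMA journal_mode=WAL;")   # readers stop blocking the writer
    conn.execute("PRAGMA busy_timeout=5000;")  # retry for 5s instead of failing fast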
> especially for local development / spinning up isolated CI instances
I really don't consider it a good idea to use different databases on your production vs development vs CI instances. Just use PostgreSQL everywhere, it doesn't cost anything, scales down to almost nothing, and scales up to whatever you're likely to need in the future.
Local Postgres is easy. If you're deploying to some cloud service, Postgres is still probably easier than SQLite, since persistent local storage is a bit more special there. I'd also be worried that SQLite doesn't provide something I need out of the box.
I have had some bad experiences with SQLite for local desktop apps with regard to memory usage, especially on macOS.
Insert and delete a few thousand rows per hour, and over a few days your memory usage has ballooned. It seems to cause a lot of fragmentation.
Curious to hear more about your experience, since my impression from hacking around in native Apple software is that a good chunk of it is built on top of sqlite3. The Photos app is a case in point I remember off the cuff.
I think we might be using it in a slightly unusual way: collect data for a while, do one big query once in a while to aggregate/transform the data and clean everything. Rinse and repeat as it's a background app.
So lots of allocations/deallocations. If you're only storing a few key/value pairs long term, you won't have any issues.
Keeping it in memory. After a few days, the memory usage reported by Activity Monitor (so not the actual resident memory, but the number customers see and complain about) grows from maybe a few tens of MB to a few hundred MB.
But as far as I can tell, it's more an OS issue than really a SQLite issue; simply doing malloc/free in a loop results in similar behaviour. And SQLite does a lot of allocations.
We see a similar problem on Windows, but not as pronounced, and there we can force the OS to reclaim memory.
It's probably solvable by using a custom allocator, but at that point it's no longer plug-and-play the way the GP meant.
While I personally really love SQLite for a lot of use cases, I wouldn't recommend or use it "in serious production" for a Django application which does more than a simple "hello world".
Why!? Concurrency ... especially if your application attracts users or you just want to scale your deployment horizontally, etc. ;))
So in my opinion:
* why not use SQLite for development and functionality testing
* PostgreSQL or MariaDB/MySQL for (serious) production instances :)
Yeah, I've also found that foregoing Postgres is one step too far. It's just too useful, especially with Listen/Notify making it a good task queue broker. SQLite is great, but Postgres is definitely worth the extra dependency.
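The Listen/Notify side is pleasantly small; a minimal consumer sketch with psycopg2 (channel name and DSN are placeholders):

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=app")
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN jobs;")

    while True:
        # Block until the connection's socket is readable, max 60s.
        if select.select([conn], [], [], 60) == ([], [], []):
            continue  # timed out, wait again
        conn.poll()
        while conn.notifies:
            notify = conn.notifies.pop(0)
            print(f"job ready: {notify.payload}")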
SQLite is not a very good choice for a typical CRUD app running on a web server. MySQL/Postgres/MariaDB will be much better. You can connect to them remotely, use GUI tools, etc. They're also much more flexible from an architectural point of view.
SQLite seems to be the hip new thing to use where MySQL should have been used in the first place. SQLite is great for many things, but not for the classic CRUD web app.
While I'm currently using SQLite: since it allows only one write transaction at a time, if I understand correctly, opening a transaction and making outside requests while inside it could potentially block your whole application if the outside requests keep timing out... so you kind of need to watch out for that.
Treating transactions like mutexes has always been the prevailing wisdom, has it not? Keep them as short as possible and do not make blocking calls within one.
This would be true for any database; something read/written during a transaction would block at least that table until the transaction is finalised.
> Most applications don't need to be single-page apps nor require heavy frontend frameworks. Even for those that can benefit from it, traditional Django views are just fine for 80% of the pages. For the rest, consider Alpine.js/HTMX
Doesn't that contradict "learn one tool well"?
I write every webpage in React, not because I think everything needs to be an SPA, but because enough things end up needing client-side state that I need something that can manage it well, and at that point it's easier to just do everything in React even if it initially seems like it would be too heavy, just like your example.
Agree with almost everything, but Celery is pretty common in my Django projects. I don't like the complexity cost, but especially when using some PaaS for hosting, it's usually the least painful option. I kinda always start out thinking this time I'll manage without, and then I have a bunch of jobs triggered via HTTP calls running into timeouts. At that point it's either threads, cron jobs (tricky with PaaS) or Celery. What's your approach?
I do the same, it's easy enough and doesn't require a ton of hosting logic.
Out of interest, how do you run your migrations in production? Deploy the service, then run an ad-hoc job with the same container again? That was one thing I was never super happy with.
In an ideal world your code-base is always compatible with the previous migration state.
So new version can work with the previous version's DB schema.
Then, yes, simply run the migration in a transaction once the new code is deployed. Postgres has fully transactional DDL changes which helps.
Of course, it heavily depends on the actual change being made. Some changes will require downtime or must be avoided if it is too heavy on the db.
Another approach, if the migration can be applied quickly, is to just run the migrations as part of the deployment script. This will cause downtime, but it can be short.
Easiest is just to run `manage.py migrate` in your Docker image's start command, so the DB is always migrated when the container starts.
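If you'd rather keep that logic in Python than in a shell entrypoint, a sketch of the same idea (project name is a placeholder; with several replicas starting at once you'd want a lock or a dedicated migration job):

    # entrypoint.py
    import os
    import django
    from django.core.management import call_command

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
    django.setup()
    call_command("migrate", interactive=False)  # apply pending migrations first

    # Hand the process over to gunicorn once the schema is current.
    os.execvp("gunicorn", ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000"])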
tl;dr: It depends on the complexity of the migrations and your uptime requirements.
Heroku, AWS Elastic Beanstalk (not a fan), Fly.io. With these, it always boils down to using their Celery/RabbitMQ offering. I also used Ofelia in the past, a simple Docker image for cron jobs, but quickly outgrew it.
Maybe I'm missing it, but I don't think any of these has a nice simple built-in feature.
Fully agree. I would also say it's easy enough to use Django for (almost) everything for a self contained SaaS startup. Marketing can be done via Wagtail. Support is managed by a reusable app that is a simple static element on every page (similar to Intercom) that redirects to a standard Django page, collects some info about the issue including the user who made it (if authenticated) etc.
I try to simplify the stack further and use SQLite with Borg for backups. Caching leverages Diskcache.
Deployment is slightly more complicated. I use containers and podman with systemd, but it could easily be a git pull & gunicorn restart.
My frontend practices have gone through some cycles. I found Alpine & HTMX too restrictive for my liking and instead prefer to use TypeScript with django-vite integration. Yes, it means using some of the frontend tooling, but it means I can use TailwindCSS, React, TypeScript etc. if I want.
How do you do background job processing without Celery? Every app/website I ever developed required some sort of background job processing, for Python I use Celery, Rails I use Sidekiq, Node.js I use Faktory. One of the biggest drawbacks (at least until a while ago) was that setting this up had to be a hacky webhook call to your own app that allowed up to N seconds request/response.
I've had success with a boring event loop which looks at the state of the system and does all the background work. It's simple to implement and much easier to manage failing tasks that need manual intervention to stop failing. It also makes it easy to implement batching of work as you always know everything that needs to be done in each loop.
I also have multiple Celery applications, but I wouldn't recommend it for smaller or simpler apps.
DB-based queues are pretty common (see the outbox pattern). You actually don't have much choice but an outbox in your relational DB anyway, if you want the job to be enqueued transactionally.
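A sketch of such a worker against a hypothetical `jobs` table, using Postgres row locks so several workers can run in parallel (psycopg2; `process` is a stand-in for your handler):

    import time
    import psycopg2

    def process(payload):
        ...  # hypothetical job handler

    conn = psycopg2.connect("dbname=app")

    while True:
        with conn, conn.cursor() as cur:  # one transaction per iteration
            cur.execute("""
                SELECT id, payload FROM jobs
                WHERE status = 'pending'
                ORDER BY id
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            """)
            row = cur.fetchone()
            if row:
                job_id, payload = row
                process(payload)
                cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
        if not row:
            time.sleep(1)  # nothing pending, poll again shortly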
Your first point resonates. I had an idea I wanted to explore and decided to use Make.com and google sheets. After two hours I said, screw this, and spun up my entire idea in a Rails app in 30 minutes.
Knowing a good Swiss army tool very well is a super power.
Completely OT and apologies if rude, but you're Gary Numan the musician?
I love finding out when celebrities have talents elsewhere. And Wikipedia says you've had quite a bit of aviation experience as well.
Kinda makes my morning... lol
p.s. The inner-city cynical kid in me is now required to throw in that I found Django disappointing even though everyone else seems to love it. Ok... identity restored... can now go back to shaking my fist at the sky as per usual...
I last touched PHP when you sprinkled it in between HTML (and deployment was FTP-ing it to a server) so I'm well out of date with that though I've heard people say that it's quite nice nowadays.
This is great, but it's for a very narrow set of programming problems.
You clearly work on web applications of moderate scale. For example, Kubernetes and Redis can suddenly become almost necessities for a back end service once it reaches a certain scale.
If you're building something yourself or in a small team, I absolutely agree with everything written in the post. In fact, I'd emphasize you should lean into this sort of quick and dirty development methodology in such a context, because this is the strength of small scale development. Done correctly it will have you running circles around larger operations. Bugs are almost always easy to fix later for a small team or solo dev operation as you can expect everyone involved to have a nearly perfect mental model of the entire project, and the code itself will regardless of the messes you make tend to keep relatively simple due to Conway's law.
In larger development projects, fixing bugs and especially architectural mistakes is exponentially more expensive as code understanding is piecemeal, the architecture is inevitably nightmarishly complex (Conway again), and large scale refactoring means locking down parts of the code base so that dozens to hundreds of people can't do anything (which means it basically never happens). In such a setting the overarching focus is and should be on correctness at all steps. The economies of scale will still move things forward at an acceptable pace, even if individual developers aren't particularly productive working in such a fashion.
Hmm, context matters a lot. I'm not sure what you consider large development projects, so maybe it's even bigger than what I'm thinking of. But getting the APIs between apps up and ready early, and getting a working setup from the database team to the frontend and app teams through some kind of backend/API team, has always proven to be the correct choice for me. Also, getting it onto a production server as fast as possible, so it's just the DNS missing from being in production, helps so much for testing and highlights bugs and other problems between teams. So the author mostly talks about this from a code perspective, but IMHO it's even more important on larger teams.
(Sidenote: having this kind of architecture, where you create layers of dependencies from one team to another, is a bad idea from my point of view, but it is still done a lot.)
It's a sliding scale. On one extreme you have solo development; on the other, the gargantuan code bases at e.g. Google, or the Linux kernel.
What you're describing is somewhere in the middle (if you imagine a logarithmic scale), it's at a point where working like a solo dev begins to break down especially over time, but not at a point where it's immediately catastrophic.
Startups sometimes work in that sort of hybrid mode where they have relatively low quality code bordering on unmaintainability, where they put off fixing its problems into the future when they've made it big.
In that case your system is a legitimate candidate for Micro Services.
Services that can each be maintained by a small team, with a clean, understandable API, appropriate protections for data (both security and consistency) and easily predictable cost and behavior.
Then you can compose these services to get more complex functionality.
I've found that a "rough draft" is pretty hard to maintain as a "draft," when you have a typical tech manager.
Instead, it becomes "final ship" code.
I tend to write ship code from the start, but do so in a manner that allows a lot of flexibility. I've learned to write "ship everywhere"; even my test harnesses tend to be fairly robust, ship-Quality apps.
A big part of that is very high-Quality modules. There's always stuff that we know won't change, or, if so, a change is a fairly big deal, so we sequester those parts into standalone modules and import them as dependencies.
Here's an example of one that I just finished revamping[0]. I use it in this app[1], in the settings popover. I also have this[2] as a baseline dependency that I import into almost everything.
It can make it really fast, to develop a new application, and can keep the Quality pretty high, even when that's not a principal objective.
Tangent: is it a Swift thing to have `/* ################## */` comment markers?
It quickly becomes very visually dominant in the source code:
> /* ###################################################################################################################################### */
> // MARK: - PUBLIC BASE CLASS OVERRIDES -
> /* ###################################################################################################################################### */
Nope. It's a "Me" thing. I write code that I want to see. I have fairly big files, and it makes it easy to scroll through, quickly. It also displays well, when compiling with docc or Jazzy.
My comment/blank line-to-code ratio is about 50/50. Most of my comments are method/function/property headerdoc/docc labels.
Here's the cloc on the middle project:
github.com/AlDanial/cloc v 2.04 T=0.03 s (1319.9 files/s, 468842.4 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Swift 33 1737 4765 5220
-------------------------------------------------------------------------------
SUM: 33 1737 4765 5220
-------------------------------------------------------------------------------
> For example, if you’re making a game for a 24-hour game jam, you probably don’t want to prioritize clean code. That would be a waste of time! Who really cares if your code is elegant and bug-free?
Hate to be an anecdote Andy here, but as someone who has done a lot of code review at (non-game) hackathons in the past (primarily to prevent cheating), the teams that performed the best were also usually the ones with the best code quality and often at least some rudimentary testing setup.
The gaming use case is what makes this apt advice. If you've got 24h to make a game and you're spending more than ~1h worrying about the source code cleanliness, I don't think it's gonna go well.
Systems like UE blueprints showcase how pointless the pursuit of clean anything is when contrasted with the resulting product experiences.
If you're an experienced developer, writing clean code doesn't add any more time than writing shitty code. It's just ingrained habits of what makes for the best productivity.
I think some people look at code cleanliness in aggregate (this code base is dirty vs. it is elegant and bug-free) and others have a fine-grained cost/benefit analysis of nonfunctional code improvements.
I'm pretty sure the latter vastly outperform the former under all circumstances - whether a quick-and-dirty hackathon or ultra-hardened production code.
> It can reveal “unknown unknowns”. Often, prototypes uncover things I couldn’t have anticipated.
This is the exact opposite of my experience. Every time I am playing around with something, I feel like I'm experiencing all of its good and none of its bad ... a honeymoon phase if you will.
It's not until I need to cover edge cases and prevent all invalid state and display helpful error messages to the user, and eliminate any potential side effects that I discover the "unknown unknowns".
I think you're talking about unknown unknowns in the tool/framework/library. I think the author is talking about unknown unknowns in the problem space.
I was talking about both. Sometimes even in a problem space time constraints demand that you utilize something off the shelf (whether you use part of it or build on top of a custom version of it).
Tools aside, I think everyone who has 10+ years can think of a time they had a prototype go well in a new problem space only to realize during the real implementation that there were still multiple unknown unknowns.
Yeah, typically when you start thinking something through and actually implementing stuff, you notice that some important part of the behaviour is missing, and it might even be something that means the project is no longer feasible.
Yes. I wanted to warn about a rough draft being too rough. There are corners one shouldn't cut, because that is where the actual problems are. I guess rally pilots do their recon at a sustained pace; otherwise they might not realize that e.g. the bump there before the corner is vicious.
Yeah, it's something like how tools that only you yourself use are so smooth to make. Like, they can be full of holes and a swaying house of cards in general, but you can still use them successfully.
> For example, if you’re making a game for a 24-hour game jam, you probably don’t want to prioritize clean code. That would be a waste of time! Who really cares if your code is elegant and bug-free?
Having worked on some 24-hour game jams and similar, I've found completely the opposite. It's when you're in a real hurry that you really can't afford bad code. Writing better code will make it easier to get it right, will put less pressure on my working memory, will let me try things faster and make those last-minute changes I wanted, will make adding features towards the end easier rather than harder and, crucially, will both reduce the chance that I need to do intense debugging and make debugging I need to do easier.
Working with good code just feels lighter.
The thing that breaks 24-hour projects isn't writing code too slowly, it's writing yourself into a corner or hitting some problem that derails your work, takes hours to solve or doesn't even get resolved until after the deadline.
A game jam isn't the place to try to squish all bugs, sure, but that's a question of what you're doing, not how. I still want to write good code in that situation because it makes my life easier within the time constraints, and because, even if I'm fine with some bugs, there are still lots of bugs that render a game unpleasant or unplayable.
I'll need to fix some bugs no matter what; I'd rather fix fewer, easier bugs than more, harder bugs!
The same thing applies to longer time horizons too. When you have more time you have more slack to deal with bad code, but that doesn't mean it makes any more sense to write it!
And, of course, once you get in the right habits, writing solid quality code becomes basically free... but, even if it really did meaningfully slow you down, chances are it would still be worth doing in expectation.
I second this. I've done lots of game jams and I think the "messy code" threshold for me is like, 1-2 hours away from the deadline at most, on files nobody else will touch. It depends on the type of cleanup, but factoring out common logic really doesn't take that long.
As the above comment says, in my experience bugs introduced from messy code are way more likely than the time savings of not cleaning up code.
The usual exception I'd make is things that are mostly the same but not quite (e.g. a function to fade out a light over time vs a function to fade out a color over time). Often I find the requirements for those diverge over time, so I'll just leave some repeated patterns across both.
Using a different algorithm is a change in what you're doing, not how you're doing it, so I'd see that as qualitatively different from writing bad code.
I think it's a misconception that writing good code must take longer than writing bad code. At least if you want it to vaguely satisfy some requirements.
> What is my team’s idea of “good enough”? What bugs are acceptable, if any? Where can I do a less-than-perfect job if it means getting things done sooner?
I hop between projects regularly, and this has been the biggest source of inter-team conflict in my career.
Different people from different backgrounds have different assumed levels of "good enough". The person from big tech is frustrated because no one else is testing thoroughly. The person from a startup is frustrated because everyone else is moving too slowly.
It would be nice if the "good enough" could be made explicit so teams are on the same page.
Isn't the current layoff-heavy tech world the biggest threat to software quality and engineer productivity?
The perpetual looming threat of layoffs and the need to deliver wins ASAP stifle creativity, punish experimentation, and push people to burnout. They force people into groupthink about topics like AI. Nobody can say the emperor has no clothes (the emperor being leadership or the topic du jour).
The biggest threat to software quality has always been, and will always be, that consumers don't pay for quality.
Consumers that have good taste (or at least perceive differences in quality) are not numerous enough to support new products that differentiate themselves only with quality. And they (the consumers) are not successful enough in their own enterprises to pay extra for better quality products.
It's easier to find examples where people do pay for quality outside of software. Look at the spectrum in quality available for vehicles or household appliances.
Yup. As engineers, we MIGHT care about code quality. The end user just cares if something works in the way they want/expect it to. A lot of large successful companies have rough code quality.
The biggest threat is vendor-lockin at the programming level, far more destructive than SAAS lockin. We already have monopolies in hardware now we are about to have monopolies in software by the same companies who monopolized the hardware. Giving them so much power that there will no longer be computer programmers, there will only be LLM prompters.
An important dimension that is not really touched upon in the article is development speed over time. This will decrease with time, project size, and team size. Minimising the reduction rate may require doing things that slow down immediate development for the sake of longer-term velocity. Some examples would be tests, documentation, decision logs, Agile ceremonies, etc.
Some omissions during initial development may have a very long tail of negative impact - obvious examples are not wiring in observability into the code from the outset, or not structuring code with easy testing being an explicit goal.
Even as a solo developer, I can swear by decision logs, tests and documentation, in that order. I personally keep a "lab notebook" instead of a "decision log": it chronicles the design in real time and forms the basis of the tests and documentation.
Presence of a lab notebook allows me to write better documentation faster, even if I start late, and tests allow me to verify that the design doesn't drift over time.
Going in blind for a one-off tool written in a weekend may be acceptable, but for anything that will live longer, building the foundation slowly allows the things built on it to be sound, rational (for the problem at hand) and, more importantly, understandable/maintainable.
Also, as an unpopular opinion, design on paper first, digitize later.
Right, an important part is keeping in mind the other future developers working on your codebase. You, 6 months later, are that other developer, once the immediate context is gone from your head. :)
This is very familiar. Rough draft, some manual execution often wrapped in a unit test executor, or even written in a different scripting language just to verify the idea. This often helped me to show that we don't even want to build the thing, because it won't work the way people want it to.
The part about distraction in code feels also very real. I am really prone to "clean up things", then realize I'm getting into a rabbit hole and my change grows to a size that my mates won't be happy reviewing. These endeavors often end with complete discard to get back on track and keep the main thing small and focused - frequent small local commits help a lot here. Sometimes I manage to salvage something and publish in a different PR when time allows it.
Business mostly wants the result fast and does not understand tradeoffs in code until the debt hits the size of a mountain that makes even trivial changes painfully slow. But it's about balance, which might be different on different projects.
Small, focused, simple changes definitely help, although people are not always good at slicing a larger solution into smaller chunks. I sometimes see commits that ship completely unused code, unrelated to anything, with a comment that it will be part of some future work... then priorities shift, people come and go, and a year later we have to throw all of that out because it no longer applies to the current state and no one remembers what the plan was.
> Data modeling is usually important to get right, even if it takes a little longer. Making invalid states unrepresentable can prevent whole classes of bugs. Getting a database schema wrong can cause all sorts of headaches later
So much this.
Get the data model right before you go live, and everything is so simple, get it wrong and be prepared for constant pain balancing real data, migrations, uptime and new features. Ask me how I know
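To make "making invalid states unrepresentable" concrete, a small sketch in plain Python (names are illustrative): one enum field instead of two booleans that could contradict each other.

    from dataclasses import dataclass
    from enum import Enum

    class OrderState(Enum):
        PENDING = "pending"
        SHIPPED = "shipped"
        DELIVERED = "delivered"

    @dataclass
    class Order:
        id: int
        state: OrderState  # shipped=False + delivered=True simply cannot exist

    order = Order(id=1, state=OrderState.SHIPPED)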
I came of age in SW dev when we started with the (database) schema. This doesn't seem to be common any more, and I regularly see experienced devs with little to no SQL exposure. They typically seem to work at an abstraction (or 2 or 3) above the API or maybe the ORM, but would struggle to write the resulting query, let alone profile it.
I'm not convinced this was a good abstraction that really helps us be more effective.
APIs, data models and architecture are the main things you can't Agile your way out of. You need to get them right up front before you start iterating on the implementation.
We encounter many rough drafts (like yours) in production systems. If the original devs are still there, the story is usually something along the lines of: I showed the rough draft to my manager, they flagged it as done, and I was assigned to another task.
I see this an awful lot. Most recently it was a presentation of hackathon projects, with product managers & executives asking how much work was left to turn them into production features. It's pretty obvious how their brains are spinning.
We already see it; the combination of job-hopping & AI is a perfect storm, really.
From launch to failure is definitely getting fast-tracked. A few months ago we had yet another hospital system that just lost data. Reading the code (no tests, no peer reviews; they don't use versioning) shows clear signs of LLMs: many files that do almost the same thing, many similar function names that do almost or actually the same thing, strange solutions to trivial problems, etc. The problem was a script run at startup which added an admin user, but if an admin user already existed, it truncated the table. No idea why, but it wasn't discovered earlier because after testing by the devs it was put live, the devs left (contractors), and it ran without issues until the EC2 instance needed maintenance by AWS and was rebooted, after which all users were gone. Good stuff. They paid around 150k for it; that is not a lot in our field, but to get this level of garbage for it is rather frightening. This was not life-threatening, but I know that if it had been, it would not have been any better; you cannot make this crap up.
And let us be very clear: this happened REGULARLY pre-AI, with long-lived systems cycling through contractors every 12, 6 or fewer months over 20 or more years. Now we have AI that allows even faster iterations of this with even less conceptual integrity.
The big problem: the decision makers(c-suite executives) never really understood what was happening before, so you can't expect them to see the root cause of the problem we're actively creating. This means it will not get the attention and resourcing needed - plus they'll be gone and on to the next one after taking a huge payday for slashing their R&D costs.
I wonder if the rough draft approach is a good prompt for an agent. Since it can draft more quickly, you can review more quickly & get it on the right track.
A lot of established dev practices - like this one - are effective with AI generation. Another is the super-valuable but less common product spec that spends a lot of effort and verbiage defining what is NOT included. LLMs are helped greatly with explicit guardrails and restrictions.
I fear this is different from the "code slop jello poured over a bespoke marshmallow salad of a system" problem though. Mostly for the same reasons Brooks described decades ago that make SW inherently hard. It feels like the JS framework / SPA experience, but with every. single. developer. and the 10x "improvement" is just speed.
Work with bad companies, be surprised by poor managers? Who is the "we" in this context, I assume an agency?
So that's not a problem with this process itself. You're describing problems with managers, and problems with developers being unable to handle bad managers.
Even putting aside the manager's incompetence, as a developer you can mitigate this easily in many different ways, here's a few:
- Just don't show it to management
- Deliberately make it obviously broken at certain steps
- Take screenshots of it working and tell people "this is a mockup, I still have to do the hard work of wiring it up"
It's all a balancing act between needing to get feedback from stakeholders and managing expectations. If your management is bad, you need to put extra work into managing expectations.
It's like the famous duck story, from Jeff Atwood (see jargon number 4), sometimes you have to manage your managers:
Sure, but we actually thrive here; my company gets called in when systems are not functioning or badly broken and they cannot fix it themselves (usually because the people who built it have been gone for decades and they just kept it running with duct tape since). We never stay for long; we just patch the system and deliver a report. But for figuring out what went wrong and writing the report, we find out how it got to be that way, and it's always the same: they suck. Talking banks, hospitals, factories, it really doesn't matter; it's all garbage what gets written, and "TODO: will refactor later" is all over the place. We see many companies from the inside, and let me tell you: HN is a lovely echo chamber that resembles nothing in the real world.
I think the main lesson here is that most entities shouldn't be writing serious software themselves, but purchase software from reputable software companies whenever possible. At least who to hold responsible or sue is clearer in that case.
I actually try to build it "well" in the first pass, even for prototyping. I'm not gonna say I succeed but at least I try.
This doesn't mean writing tests for everything, and sometimes it means not writing tests at all, but it means that I do my best to make code "testable". It shouldn't take more time to do this, though: if you're making more classes to make it testable, you're already messing it up.
This also doesn't mean compromising on readability, but it does mean eschewing practices like "Clean Code". Functions end up being as large as they need to be. I find that a lot of people doing especially Ruby and Java tend to spend too much time here. IMO having lots of 5-line functions is totally unnecessary, so I just skip this step altogether.
It also doesn't mean compromising on abstractions. I don't even like the "rule of three" because it forces more work down the line. But since I prefer DEEP classes and SMALL interfaces, in the style of John Ousterhout, the code doesn't really take longer to write. It does require some thinking but it's nothing out of the ordinary at all. It's just things that people don't do out of inertia.
One thing I am a bit of hardliner about is scope. If the scope is too large, it's probably not prototype or MVP material, and I will fight to reduce it.
EDIT: kukkeliskuu said below "learn one tool well". This is also key. Don't go "against the grain" when writing prototypes or first passes. If you're fighting the framework, you're on the wrong path IME.
I personally find that doing it well in the first pass slows me down and also ends up in worse overall designs.
But I am also pretty disciplined on the 2nd pass in correcting all of the hacks and rewriting everything that should be rewritten.
There are two problems I have with trying to do it right the first time:
- It's hard to know the intricacies of the requirements upfront without actually implementing the thing, which results in designing an architecture with imperfect knowledge
- It's easy to get stuck in analysis paralysis
FWIW I am a huge fan of John Ousterhout. His book may be my all-time favorite on software design.
I have found that too much coupling between product requirements and the architecture can be detrimental. It's often the reason why people tend to do too much upfront work, and it also slows down the evolution of the feature.
So I don't really want to know the future requirements, or refactor on the 2nd pass to "match".
If some feature needs too many modifications or special cases in the current architecture, it's a square peg in a round hole. I prefer to have those places be a bit more "painful" in the code. The code doesn't have to be bad per se, but it should be clear that something different and non-traditional is happening there.
This pretty much exactly describes my strategy for shipping better code faster. Especially the "top down" approach: I'm actually kind of surprised there isn't a "UI first" or "UI-Driven Development" manifesto like with TDD or BDD. Putting a non-functional UI in front of stakeholders quickly often results in better requirements gathering and early refinement that would be more costly later in the cycle.
Well, sometimes I will, but for example take a simple list+form on top of a database. Instead of building the UI and the database and then showing the stakeholder, who adds/renames fields, changes relationships etc., I will intentionally build just the UI, not wired up to the database. Sometimes just to an in-memory store, or nothing. Then, _after_ the stakeholder is somewhat happy with the UI, I "bake" things like a service or data layer, etc. This way the changes the stakeholder inevitably has up front have less of an impact.
Well, most of the time the people I worked with preferred having something they could see and comment on earlier, even if just by a few days. Maybe that's why it works for him too.
This post resonates deeply with how I build products, especially in the era of LLMs and AI-assisted coding.
I usually start top-down, sketching the API surface or UI scaffold before diving into real logic. Iteration drives the process: get something running, feel out the edges, and refine.
I favor MVPs that work end-to-end, to validate flow and reduce risk early. That rhythm helps me ship production-ready software quickly, especially when navigating uncertainty.
One recent WIP: https://zero-to-creator.netlify.app/. I built it for my kid, but I’m evolving it into a full-blown product by tweaking the edges as I go.
One important aspect, also highlighted by others, is that for the long term you actually _don't_ want to focus solely on the immediate task you're solving. Sure, in the short term the tasks get done quicker, but since the end goal typically is a full coherent solution, you _have_ to step back and take a look at the bigger picture every now and then. Typically you won't be allocated specific time for this, so the "take a bird's eye view" part has to be incorporated into day-to-day work instead. It's also typically easier to notice bigger issues while you're already in the trenches, compared to doing "cleanup" separately "later".
Just one thing I'd like to throw out here after far too long in the industry: It's very hard to tell ahead of time just how users will put computers/software to use. Lots of generic, off-the-shelf HW and SW get used in ways that are ultimately 'life critical' to someone.
Something to keep in mind for design/development/testing.
The initial rough draft almost reminds me of the old "Build One to Throw Away" approach, which I think is pretty nice - not getting caught up in making something production ready, but rather exploring the problem space first.
I do admit that modern frameworks also help in that regard, instead of just stitching libraries together, at least for typical webdev stuff instead of more minimalistic CLI utilities or tools. The likes of Ruby on Rails, Django, Laravel, Express and even the likes of ASP.NET or Spring Boot.
I love his writing too! I read this post a few days ago and really liked it, so I started going through his older posts. It's no coincidence that his writing is good—he's actively working to improve it: https://evanhahn.com/economist-style-guide-book-takeaways/.
This is very important and requires some foresight when the real data is personally identifiable information, private health information, etc.
It's possible, but requires designing a safe way to run pre-production code that touches production data. Which in practice means you better be sure you're only doing reads, not writes, and running your code in the production environment with all the same controls as your production code.
I meant data heterogeneity - the variety in formats, edge cases, and data quality you encounter in production. Real user data often has inconsistencies, missing fields, unexpected formats, etc. that synthetic test data tends to miss.
This helps surface integration issues and performance bottlenecks early.
I often use C# and Visual Studio to write prototype code. C# can be used in a C-like syntax (my destination language) and has much better turnaround times.
Building software quickly seems to mostly come from having enough examples of code you've already built that you can pull from.
I recently re-made my web based notes app. Before working on this project I made a web based S3 file manager (e.g. CRUD operations in my own UI).
Instead of trying to store notes in a database or something, I just yoinked the S3 file manager code and store my notes in S3. Just a few tweaks to the UI and a few more features and now I have a notes app.
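That reuse is easy because the storage layer is tiny; a sketch of roughly what such an S3-backed store looks like with boto3 (bucket and prefix are placeholders, not the parent's actual code):

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-notes-bucket"

    def save_note(name: str, body: str) -> None:
        s3.put_object(Bucket=BUCKET, Key=f"notes/{name}.md", Body=body.encode())

    def load_note(name: str) -> str:
        obj = s3.get_object(Bucket=BUCKET, Key=f"notes/{name}.md")
        return obj["Body"].read().decode()

    def list_notes() -> list[str]:
        resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="notes/")
        return [o["Key"] for o in resp.get("Contents", [])]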
Fast builds are important. I've been doing server side stuff for a few decades and there are some things you can do to turn slow builds into fast builds. I mostly work on the JVM but a lot of this stuff ports well to other stacks (e.g. Ruby or python).
Basically there are things you can't avoid that are not necessarily fast (e.g. compilation, docker build, etc.) and things that you can actually control and optimize. Tests and integration tests are part of that. Learning how to write good, effective tests that are quick to run is important, because you might end up with hundreds of those and you'll spend a lot of your career waiting for them to run. Over and over again.
Here's what I do:
- I run integration tests concurrently. My CPUs max out when I run my tests. My current build runs around 400 integration tests in about 35 seconds. Integration test means the tests are proper black box tests that hit a REST API with my server talking to a DB, Elasticsearch and Redis. Each test might require users/teams and some content set up. We're talking many thousands of API calls happening in about 35 seconds.
- There is no database cleanup in between tests. Database cleanup is slow. Each build starts with an ephemeral docker container. So it starts empty but by the time the build is over you have a pretty full database.
- To avoid test interaction, all data is randomized. I use a library that generates human-readable names, email addresses, etc. Creating new users/teams is fast; recreating the database schema isn't. And because at any time there can be 10 separate tests running, you don't want that anyway. Some tests share the same read-only test fixture and team. Recreating the same database content over and over again is stupid. (A sketch of this follows after the list.)
- A proper integration test is a scenario that is representative of what happens in your real system. It's not a unit test. So the more side effects, the better. Your goal is to find anything that might break when you put things together. Finding weird feature interactions, performance bottlenecks, and sources of flakiness is a goal here and not something you are trying to avoid. Real users don't use an empty system. And they won't have it exclusive to themselves either. So having dozens of tests running at the same time adds realism.
- Unit tests and integration tests have different goals. With integration tests you want to cover features, not code. Use unit tests for code coverage. The more features an integration test touches, the better. There is a combinatorial explosion of different combinations of inputs. It's mathematically impossible to test all of them with an integration test. So, instead of having more integration tests, write better scenarios for your tests. Add to them. Refine them with detail. Asserting stuff is cheap. Setting things up isn't. Make the most of what you setup.
- IMHO anything in between scenario tests and unit tests is a waste of time. I hate white box tests. Because they are expensive to run and write and yet not as valuable as a good blackbox integration test. Sometimes you have to. But these are low value, high maintenance, expensive to run tests. A proper unit tests is high value, low maintenance and very fast to run (it mocks/stubs everything it needs, there is no setup cost). A proper integration tests is high value, low maintenance, and slow to run. You justify the time investment with value. Low maintenance here means not a lot of code is needed to set things up.
- Your integration test becomes a load and stress test as well. Many teams don't bother with this. I run mine 20 times a day. Because it only takes less than a minute. Anything that increases that build time, gets identified and dealt with. My tests passing gives me a high degree of certainty that nothing important has broken.
- Most of the work creating a good test is setting up the given part of a BDD style test. Making that easy with some helper functions is key. Most of my tests require users, teams, etc. and some objects. So I have a function "createTeam" with some parameters that call all the APIs to get that done. This gets called hundreds of time in a build. It's a nice one liner that sets it up. Most of my tests read like this: create a team or teams, do some stuff, assert, do more stuff, assert, etc.
- Poll instead of sleeping. A lot of stuff happens asynchronously, so there is a lot of test code that waits for shit to happen. I use kotest-assertions, which has a nice "eventually" helper that takes a block and runs it until it stops throwing exceptions (or times out). It retries at a configurable interval, backing off with increasing sleep periods. Most things just take a second or two to happen. (There's a sketch of this pattern right after this list.)
- If your CPUs are not maxed out during the test run, you need to be running more tests, not fewer. Server tests tend to be IO blocked, not CPU blocked. And your SSD is unlikely to be the bottleneck. We're talking network IO here, and it's all running on localhost. So, if your CPUs are idling, you can run more tests and use more threads, co-routines, whatever.
- Get a decent laptop and pay for fast CI hardware. It's not worth waiting 10 minutes for something that could build in about a minute. That speedup is worth a lot. And it's less likely to break your flow state.
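To make the randomized-setup and polling ideas above concrete, here's a minimal sketch in Python (my stack is Kotlin/kotest, but the pattern is language-agnostic); the `api` client, the endpoints and `create_team` are hypothetical stand-ins, not my actual codebase:

    import random
    import string
    import time

    def random_name(prefix):
        # Randomized, human-readable-ish identifiers keep concurrently
        # running tests from clashing without any database cleanup.
        suffix = "".join(random.choices(string.ascii_lowercase, k=8))
        return f"{prefix}-{suffix}"

    def create_team(api, members=3):
        # One-liner "given" setup for a scenario: goes through the real
        # REST API instead of poking the database directly.
        team = api.post("/teams", json={"name": random_name("team")}).json()
        for _ in range(members):
            user = api.post("/users", json={
                "name": random_name("user"),
                "email": f"{random_name('mail')}@example.com",
            }).json()
            api.post(f"/teams/{team['id']}/members", json={"user_id": user["id"]})
        return team

    def eventually(assertion, timeout=10.0, interval=0.1, backoff=1.5):
        # Poll instead of sleeping: re-run the block until it stops
        # raising, backing off with increasing sleeps, or fail at timeout.
        deadline = time.monotonic() + timeout
        while True:
            try:
                return assertion()
            except AssertionError:
                if time.monotonic() >= deadline:
                    raise
                time.sleep(interval)
                interval *= backoff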
This stuff is a lot easier if you engineer and plan for it. Introducing concurrently running tests to a test suite that isn't ready for it can be hard. Engineering your tests to be able to support running concurrently results in better tests. So if you do this properly, you get better tests that run faster. Win win. I've been doing this for a while. I'm very picky about what is and isn't a good test. There are a lot of bad tests out there.
I find people overfocus on fast-running tests, often to the exclusion of tests that exercise the system realistically and couple loosely to the code.
This is a pretty natural trap to fall into, because "I shaved 2.5 minutes off my build" is an easy claim to make, whereas "I increased the maintainability and realism of our tests, adding 3 minutes to the build" is much more nebulous and hard to justify, even when it saves you time in the long run.
As Drucker says, "what gets measured gets managed" <- quantifiable metrics get more attention even when they're less important.
>A proper unit test is high value, low maintenance and very fast to run (it mocks/stubs everything it needs, so there is no setup cost).
^^ this is a case in point, mocks and stubs do make fast running test code but they commensurately decrease the realism of that test and increase maintenance overhead. Even in unit tests I've shifted to writing almost zero mocks and stubs and using only fakes.
I've had good luck writing what I call "end to end unit tests" where the I/O boundary is faked while everything underneath it is tested as is, but even this model falls over when the I/O boundary you're faking is large and complex.
In database heavy applications, for instance, so much of the logic will be in this layer that a unit test will demand massive amounts of mocks/stubs and commensurate maintenance and still tell you almost nothing about what broke or what works.
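To illustrate the fake-over-mock distinction, a minimal sketch (all names hypothetical): a fake has real behaviour behind the same interface and can be reused across tests, rather than being scripted per-test with canned answers:

    class InMemoryUserRepo:
        # A fake: real behaviour (stores, queries, enforces uniqueness),
        # just without a real database behind it.
        def __init__(self):
            self._emails_by_id = {}

        def add(self, user_id, email):
            if email in self._emails_by_id.values():
                raise ValueError("duplicate email")
            self._emails_by_id[user_id] = email

        def find_by_email(self, email):
            for user_id, stored in self._emails_by_id.items():
                if stored == email:
                    return user_id
            return None

    def register(repo, user_id, email):
        # The code under test only talks to the boundary's interface,
        # so the fake and the real repository are interchangeable.
        if repo.find_by_email(email) is not None:
            raise ValueError("already registered")
        repo.add(user_id, email)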
Fast and comprehensive are not mutually exclusive goals. Having fast tests makes it more likely you'll add more and better tests as well. Because the cost of doing that gets lower. I have some pretty comprehensive tests that setup fairly complicated scenarios. The cost for that is low.
A slow test will be something you avoid running when you should be running it. When it takes 20 minutes to validate a change, it gets tempting to skip it or postpone doing it. Or you'll do it and get side-tracked by something else. The ability to run high quality integration tests quickly is a superpower. We're used to these things running slowly, but my point is that you can engineer it such that it's fast, and that's worth doing.
IMHO a key mistake is treating integration tests as unit tests. Which then dictates you do silly things like running expensive cleanup logic, isolating tests from each other, and giving them their own virgin system to run against. That actually makes your tests less valuable and more tedious to run.
The real world is messy. A good integration test benefits from the noise created by lots of other tests running. It's the closest you can get to a real running live system without using the actual live running system and testing in production. Real users will never see a virgin system and they won't be alone in the system. It's OK for there to be data in the database. You can isolate through other means: give tests their own users. Randomize key things so they don't clash with other tests, etc. This results in better tests that actually run faster.
I love my unit tests as well. But I don't unit test things that I cover with an integration test anyway. I reserve those for things that are proper units that I can easily test in isolation: anything with complicated logic, regular expressions, or algorithms, basically. Testing that with an integration test is counterproductive, because your goal is to test the logic and you probably want to do that with lots of different inputs. And then you mock/fake anything that just gets in the way of testing that.
But unit testing APIs is silly if you are in any case writing proper full blown integration / scenario tests that use those APIs. I don't need to unit test my database layer with an in memory database either. If it's at all important, that functionality will be used as part of my integration tests triggering logic that needs a database. And it will run on a proper database. And I can use my API from the outside to evaluate the output and assert everything is as it should be without poking around in that database. This adds more API calls and realism to the test and ensures I don't need to make assumptions about the implementation. Which then means I can change the implementation and validate that it didn't break anything important.
This is why I have grown to appreciate gradual typing, at least for solo projects. In Python-land I can just riff over a few functions/scripts until I get a rough idea of the APIs/workflows I want, then bring mypy into the mix and shape things into their "final" form (this takes maybe a few hours). Rinse and repeat for each new feature, but at every iteration you build up from a "nicely-typed" foundation.
Sometimes a redesign of the types you relied on becomes necessary to accommodate new stuff, but that would be true under any language; otoh, the "exploratory" part of coding feels faster and easier.
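A tiny illustration of that loop (hypothetical function): the first version is the exploratory riff, the second is the mypy-checked "final" form.

    # First pass: riff with no annotations until the shape settles.
    def summarize(events):
        totals = {}
        for name, amount in events:
            totals[name] = totals.get(name, 0) + amount
        return totals

    # Later pass: freeze the API with types and let mypy keep it honest.
    def summarize_typed(events: list[tuple[str, int]]) -> dict[str, int]:
        totals: dict[str, int] = {}
        for name, amount in events:
            totals[name] = totals.get(name, 0) + amount
        return totals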
"Learn to walk before you run. I firmly believe in this wisdom. First build a foundational model, then tackle boundary challenges and efficiency optimization."
K8s is chock full of complexity though. Complexity that the vast majority of applications do not, and will not need
I only used it once forever ago, and all I remember is being very confused about pod vs service and using a lot of magic strings to get load-balancing to work.
I mean, a basic k3s install + a barebones deployment combined with a NodePort service is about as trivial as you can get. It's not THAT many more lines of config (60-70 max) than an nginx conf + "git clone + other stuff" deploy script.
I dunno, a sane nginx config is like 30 lines max for a basic service. You can manage it entirely with git (`git config receive.denyCurrentBranch updateInstead`) and some post-receive / push-to-checkout hooks.
Keep in mind, for K8s to work you also need to containerize your application, even if it's just static files (or I guess you can break containment and do volume mounts, or some sort of static file resource if you're serving static content). For non-static applications, you still need to configure and run it, which is just harder with k8s/docker in the mix.
Agree. Maybe it should be more like "forget learning Kubernetes, Redis or anything you don't already know", since you don't want to bifurcate time between the learning curve and the product.
I also like Django a lot. I can get a working project up and running trivially fast.
In my day job I work with Go and while it's fine, I end up writing 10x more code for simple API endpoints and as soon as you add query parameters for filtering, pagination, etc. etc. it gets even longer. Adding a permissions model on top does similar. Of course there's a big performance difference but largely the DB queries dominate performance, even in Python, at least for most of the things I do.
Yes I really wish for something like Django for a statically typed language (maybe Spring? Haven't tried it).
I'm writing a CRUD CLI ( https://github.com/bbkane/enventory/ ) partly to practice doing it in Go, and while I'm mostly happy with the resulting code, there's just a whole lot of it. At least it's simple enough that I can trust LLMs to do OK jobs with detailed prompts (example: https://github.com/bbkane/enventory/blob/master/.github/prom... )
Spring Boot is pretty good, if Java/Kotlin is your language of preference.
Still not the same bells and whistles as Django, but really good otherwise. The database interactions are pretty good.
hello,
as always: imho. (!)
hmmm ... i think spring-boot combined with either java or kotlin is a very good alternative to django.
even so i wouldn't compare them directly, but static typing avoids a lot of problems.
idk ... for me personally one of django's great features is its custom db-model and relatively (!) painless db schema-migration.
for spring-boot i often went with the tool flyway for handling db-migration ...
just my 0.02€
You mention Django, but these days, are you using the full Django experience or are you mostly writing REST APIs?
In Java land a very nice and much lighter weight framework is Dropwizard: https://github.com/dropwizard/dropwizard
It's basically a sort of All-Stars collection of Java libraries, nicely packaged and with some nice conventions.
Towards the more serverless route there's Micronaut: https://micronaut.io/
I haven't tried Dropwizard. Is it as batteries-included as Spring Boot? E.g.: how is authentication support? Do we need a lot of boilerplate to implement, let's say, an OAuth2 resource server?
oh that's interesting. Is that due to missing libraries in Go? That could be a nice open source project if so.
There's an almost pathological resistance to using anything that might be described as a 'framework' in the Go community in the name of 'simplicity'.
I find such a blanket opinion to be unhelpful, what's fine for writing microservices is less good for bootstrapping a whole SaaS app and I think that people get in a bit too much of an ideological tizz about it all.
In my experience frameworks (in contrast to "libraries") add complexity making your application harder to understand and debug once you get past the initial prototype.
Also, by their very nature they almost require you to write a program of a minimum size even when a simpler program would do to solve your problem.
> Also, by their very nature they almost require you to write a program of a minimum size even when a simpler program would do to solve your problem.
I'm not sure I agree. Sure, they require you to pull in a large dependency, and that might not be good for some use cases (particularly microservices) where you're sensitive to the final size of the image. But with Django, for example, you get a generated project with settings.py, which configures your database, middleware, etc., and urls.py, which configures your routes, and then you're really free to structure your models and endpoints however you like.
I don't think anyone would advise to do everything from scratch all the time.
It's mostly about libraries vs opinionated frameworks.
No one in their right mind would say "just use the standard library", but I've seen it said online. That discourse is not helping.
I think people get this misconstrued on both sides.
A set of reusable, composable libraries would be the right balance in Go. So not really a "framework" either.
I think that reflects better the actual preferred stance.
It's always going to be more work with composable libraries since they don't 'flow'.
Just picking one of the examples I gave, pagination - that requires (a) query param handling (b) passing the info down into your database query (c) returning the pagination info in the response. In Django (DRF), that's all built in, you can even set the default pagination for every endpoint with a single line in your settings.py and write no more code.
In Go your equivalent would be wrangling something (either manually or using something like ShouldBindQuery in Gin) to decode the specific pagination query params and then wrangling that into your database calling code, and then wrangling the results + the pagination results info back.
Composable components therefore always leave you with more boilerplate.
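For contrast, the DRF side of that pagination example really is a couple of settings lines. Something like this (using DRF's documented pagination settings) gives every endpoint a paginated default:

    # settings.py: one project-wide default for every DRF endpoint
    REST_FRAMEWORK = {
        "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
        "PAGE_SIZE": 25,
    }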
I guess there are pros and cons. Pros of composable libraries is that you can more easily build upon them for your specific use case.
It also doesn't tie you to ORM usage.
You have to be responsible for "your" specific flow... meaning you can build your own defaults easily wrt parsing query parameters and whatnot (building a generic paginated query builder API?). Nothing insurmountable.
That's fine as long as you never need to deviate from the default behavior of the framework.
Fortunately the framework is pretty configurable on all this sort of stuff! Just to hit pagination again since it's the example I've used in other comments, in DRF you implement a custom pagination class:
https://www.django-rest-framework.org/api-guide/pagination/#...
That's the same as for most other configurable things. And ultimately if you want to override the behaviour for some specific endpoint then you can still easily do that by just implementing your own method for it.
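e.g. a custom pagination class is roughly this (adapted from the linked docs):

    from rest_framework.pagination import PageNumberPagination

    class LargeResultsSetPagination(PageNumberPagination):
        # Override the defaults; apply globally via settings
        # or per-view via pagination_class.
        page_size = 1000
        page_size_query_param = "page_size"
        max_page_size = 10000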
After working in Go now for several years, what I've found is that generally people just don't do some things in their first pass because it's too much work and makes the PRs too large, and then they retrofit them afterwards. Which meant that the APIs were less consistent than the ones I worked on in Django.
> No one in their right mind would say: just use the standard library but I've seen it online.
Yes but...sometimes the proper level of abstraction is simply a good HTTP library. Where the API of your application is defined in terms of URLs, HTTP methods, status codes, etc., the http library from the Go standard library might be all you need (depending on other requirements, of course).
A real life example: I needed a simple proxy with well defined behavior. Writing it in Go as a web application that forwarded calls to another service based on simple logic only took a few pages of code in a single file.
I don't dispute that. But in general you need sessions, you need a few middleware to, for example, handle CSRF or gzip compression of the response, auth, etc. Telling people to just use the standard library doesn't help. Of course it is available. But often there is a need for more.
Look, I'm not for a moment suggesting that a proxy application would be a good fit for Django (or even Python). I'm talking about 'building a full stack web application in a reasonable amount of time, with auth, with a database, etc. etc.'
> No one in their right mind would say: just use the standard library but I've seen it online. That discourse is not helping.
I would say that.
The most important thing about standard library is its stability. You won't ever need to touch code that works with standard library. It's finished code. Other than bug fixes, of course.
Third-party libraries is a very different thing.
They get abandoned all the time, and then you're left with a burden. You either need to migrate to another library, maintain the abandoned library yourself, or live with a huge chunk of code that might be vulnerable.
They also get changed often enough, as their developers are probably not as careful about backwards compatibility as core language developers.
A third-party library is a liability. Very rarely is its source code an ideal fit for your application. Often you'll use 10% of the library, and the rest is dead weight at best, a vulnerability source at worst. Remember log4shell? Instead of using standard logging code in Java, some developer decided to pull in the log4j library, which is very nice to use and has lots of features. It can even download and execute code behind your back. Very featureful.
Of course I'm not advocating to rewrite the world. This is insane. Some problems are just too big to solve by yourself. I also should note, that different ecosystems have different approaches to the language library and overall library culture. JS is terrible, while Go is not that bad, but it's not ideal either.
But my absolutely 100% always approach is to use standard library first and foremost. I won't use third-party library just to save few lines of code or make code more "beautiful". I prefer dependency-free boring repetitive code any day.
And if I'm using third-party library, I'm very picky about its stability and transitive dependencies.
It also depends on the kind of company. My experience has always been: you write some service, you throw it at production, it works for the next 20 years. So you want this code to be as self-contained as possible, to reduce "chore" time spent on dependency management. The perfect application is dependency-free software which can be upgraded by changing the FROM line in the Dockerfile. It is stable enough that you can trust CI to do that.
I don't think that everyone is capable of or should be implementing csrf protection or cors handling. While the standard library is an awesome starting point, telling people that it is sufficient is not going to convince them.
I mean, I'm keen on a small number of dependencies too, but the smaller the scope of a package, the more likely I find it is to be abandoned. The Go JWT library has been forked twice because it was abandoned by the original authors, just to give an example.
What's so hard about writing an SQL query and some json struct tags? I really don't like LLMs, but all the pain points of Go you described above are relatively moot now if you use them for these little things.
There's nothing hard about it, it's just that the sum total of code you need to write is much larger, and that takes more time.
> Always choose extremely boring technology. Just use python/Django/Postgres for everything.
Hell, think twice before you consider postgres. Sqlite scales further than most people would expect it to, especially for local development / spinning up isolated CI instances. And for small apps it tends to be good enough for production too.
Sqlite is mostly boring but I've found that there's just slightly more risk of something going wrong because of the way it handles locking between threads. It has tended to misbehave under unexpected load and been difficult to fix in a way that Postgres hasn't.
I'm particularly thinking of workers running tasks, here. It's possible to lock up everything with write transactions or cause a spate of unhandled SQLITE_BUSY errors in cases where Postgres would just keep chugging along.
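For what it's worth, the usual mitigations are a busy timeout and WAL mode; a minimal sketch with Python's sqlite3 (this softens the problem, but doesn't remove the single-writer limit):

    import sqlite3

    # timeout= makes writers wait on locks instead of failing immediately.
    conn = sqlite3.connect("app.db", timeout=5.0)
    # WAL lets readers proceed while a writer is active; there is still
    # only ever one writer at a time.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA busy_timeout=5000")  # retry window in ms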
> especially for local development / spinning up isolated CI instances
I really don't consider it a good idea to use different databases on your production vs development vs CI instances. Just use PostgreSQL everywhere, it doesn't cost anything, scales down to almost nothing, and scales up to whatever you're likely to need in the future.
Local Postgres is easy. If you're deploying to some cloud service, Postgres is still probably easier than SQLite, since persistent local storage is a bit more special there. I'd also be worried that SQLite doesn't provide something I need out of the box.
I have had some bad experiences with Sqlite for local desktop apps with regards to memory usage, especially on macOS. Insert and delete a few thousand rows per hour, and over a few days your memory usage has ballooned. It seems to cause a lot of fragmentation.
Curious to hear more about your experience, since my impression from hacking around in native Apple software is that pretty much a bunch of it is based on top of sqlite3. The Photos app is a case in point I remember off the cuff.
I think we might be using it in a slightly unusual way: collect data for a while, do one big query once in a while to aggregate/transform the data and clean everything. Rinse and repeat as it's a background app.
So lots of allocations/deallocations. If you're only storing a few key/value pairs long term, you won't have any issues.
What kind of memory usage are you talking about? That sounds wild.
Are you keeping the entire database in memory, or are you flushing to disk?
It seems wild that such a critical and pervasive piece of software would behave like that.
Keeping it in memory. After a few days, the memory usage reported by the Activity Monitor (so not the actual resident memory, but the one customers see and complain about) grows from maybe a few tens of MB to a few hundred MB.
But as far as I can tell, it's more an OS issue than really a Sqlite issue, simply doing malloc/free in a loop results in a similar behaviour. And Sqlite does a lot of allocations. We see a similar problem on Windows, but not as pronounced, and there we can force the OS to reclaim memory.
It's probably solvable by using a custom allocator, but at this point it's no longer plug and play the way the GP meant.
It's easier to spin up a RDS instance or equivalent than to set up EBS / EFS just to host a tiny sqlite file in a pvc for me personally
hello,
as always: imho. (!)
while i personally really love SQLite for a lot of use-cases, i wouldn't recommend / use it "in serious production" for a django-application which does more than a simple "hello world".
why!? concurrency ... especially if your application attracts users or you just want to scale your deployment horizontally etc. ;))
so in my opinion:
* why not use sqlite for development and functionality testing
* postgresql or mariadb/mysql for (serious) production-instances :)
just my 0.02€
Yeah, I've also found that foregoing Postgres is one step too far. It's just too useful, especially with Listen/Notify making it a good task queue broker. SQLite is great, but Postgres is definitely worth the extra dependency.
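A sketch of what that looks like as a worker loop, following the documented psycopg2 notification pattern (the `jobs` channel name is made up):

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=app")
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN jobs;")

    while True:
        # Block until the connection's socket becomes readable (or 60s
        # passes), then drain whatever notifications Postgres queued.
        if select.select([conn], [], [], 60) != ([], [], []):
            conn.poll()
            while conn.notifies:
                note = conn.notifies.pop(0)
                print("woke up for job:", note.payload)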
Sqlite is not a very good choice for a typical CRUD app running on a web server. MySQL/Postgres/MariaDB will be much better. You can connect to it from remote and use GUI tools etc. Also much more flexible from an architectural point of view.
Sqlite seems to be the hip new thing to use where MySQL should have been used in the first place. Sqlite is great for many things, but not for the classic CRUD web app.
Although I'm currently using sqlite: since it allows only one write transaction at a time (if I understand correctly), opening a tx and doing outside requests while in the transaction could potentially block your whole application if the outside requests keep timing out... so you kinda need to watch out for that.
Treating transactions like mutexes has always been the prevailing wisdom, has it not? Keep them as short as possible and do not make blocking calls within one.
This would be true for any database, something read / written during a transaction would block at least that table universally until the transaction is finalised.
What do you save by going with sqlite vs how much pain are you going to endure in the future?
> Most applications don't need to be single-page apps nor require heavy frontend frameworks. Even for those that can benefit from it, traditional Django views is just fine for 80% of the pages. For the rest, consider AlpineHJS/HTMX
Doesn't that contradict "learn one tool well"?
I write every webpage in React, not because I think everything needs to be an SPA, but because enough things end up needing client-side state that I need something that can manage it well, and at that point it's easier to just do everything in React even if it initially seems like it would be too heavy, just like your example.
Agree with almost everything, but Celery is pretty common in my Django projects. I don't like the complexity cost, but especially when using some PaaS for hosting, it's usually the least painful option. I kinda always start out thinking this time I'll manage without, and then I have a bunch of jobs triggered via HTTP calls running into timeouts. At that point it's either threads, cron jobs (tricky with PaaS) or Celery. What's your approach?
This.
I use celery with the same code base/docker image. Just a different entry point to start a celery worker instead of a wsgi (web) worker.
Too many http requests? Add web worker instances.
Background jobs piling up? Add celery workers.
Clearly separate read endpoints from write/transactional endpoints and you can hit a slave postgres db or the master db depending on the http call.
This creates a very robust system that can scale easily from a single code base.
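For anyone curious, a minimal sketch of that wiring (module paths hypothetical); this is the standard documented Django + Celery setup, where web and worker share one code base and differ only in the process you start:

    # proj/celery.py: the standard Django + Celery wiring
    import os
    from celery import Celery

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "proj.settings")

    app = Celery("proj")
    app.config_from_object("django.conf:settings", namespace="CELERY")
    app.autodiscover_tasks()

    # Same image, two entry points:
    #   web worker:    gunicorn proj.wsgi
    #   celery worker: celery -A proj worker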
I do the same, it's easy enough and doesn't require a ton of hosting logic.
Out of interest, how do you run your migrations in production, deploy the service then run an ad-hoc job with the same container again? That was one thing I was never super happy with.
In an ideal world your code-base is always compatible with the previous migration state.
So new version can work with the previous version's DB schema.
Then, yes, simply run the migration in a transaction once the new code is deployed. Postgres has fully transactional DDL changes which helps.
Of course, it heavily depends on the actual change being made. Some changes will require downtime or must be avoided if it is too heavy on the db.
Another approach if the migration can be applied quickly is to just run the migrations as part of the deployment script. This will cause a downtime but can be short.
Easiest is just to run `manage.py migrate` in your docker image start command, so the DB is always migrated when the container starts.
tl;dr: It depends on the complexity of the migrations and your uptime requirements.
What Paas do you use? Pythonanywhere has an always on task feature or scheduled tasks where you can run command scripts.
Heroku, AWS Elastic Beanstalk (not a fan), Fly.io. With these, it always boils down to using their Celery/RabbitMQ offering. I also used Ofelia in the past, a simple Docker image for cron jobs, but quickly outgrew it.
Maybe I'm missing it, but I don't think any of these has a nice simple built-in feature.
You can have a queue in your relational DB (example: https://github.com/pgmq/pgmq). It'll scale pretty far.
Fully agree. I would also say it's easy enough to use Django for (almost) everything for a self contained SaaS startup. Marketing can be done via Wagtail. Support is managed by a reusable app that is a simple static element on every page (similar to Intercom) that redirects to a standard Django page, collects some info about the issue including the user who made it (if authenticated) etc.
I try to simplify the stack further and use SQLite with Borg for backups. Caching leverages Diskcache.
Deployment is slightly more complicated. I use containers and podman with systemd but could easily be a git pull & gunicorn restart.
My frontend practices have gone through some cycles. I found Alpine & HTMX too restrictive for my liking and instead prefer to use Typescript with django-vite integration. Yes, it means using some of the frontend tooling, but it means I can use TailwindCSS, React, Typescript etc. if I want.
> Forget Celery
How do you do background job processing without Celery? Every app/website I ever developed required some sort of background job processing, for Python I use Celery, Rails I use Sidekiq, Node.js I use Faktory. One of the biggest drawbacks (at least until a while ago) was that setting this up had to be a hacky webhook call to your own app that allowed up to N seconds request/response.
I've had success with a boring event loop which looks at the state of the system and does all the background work. It's simple to implement and much easier to manage failing tasks that need manual intervention to stop failing. It also makes it easy to implement batching of work as you always know everything that needs to be done in each loop.
I also have multiple celery applications, but I wouldn't recommend it for smaller or simpler apps.
> boring event loop ... state of the system ... background work
Can you explain what you mean?
If you can tolerate 1 sec latency, Django-background-tasks is pretty good.
DB-based queues are pretty common (see outbox pattern). You actually don't have much choice but an outbox in your relational DB anyway, if you want the job to be enqueued transactionally.
You write the demand to the database, and have a worker process execute it. Exactly like you'd do with celery.
Celery is a solution to scalability, not to any business problem in particular.
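A sketch of that worker on plain Postgres (the jobs table shape and `handle` are hypothetical), using SELECT ... FOR UPDATE SKIP LOCKED so concurrent workers don't grab the same row:

    import psycopg2

    def claim_and_run_one(conn, handle):
        # Claim one pending job; SKIP LOCKED makes concurrent workers
        # pass over rows another worker is already holding.
        with conn, conn.cursor() as cur:
            cur.execute("""
                SELECT id, payload FROM jobs
                WHERE status = 'pending'
                ORDER BY id
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            """)
            row = cur.fetchone()
            if row is None:
                return False
            job_id, payload = row
            handle(payload)  # the actual work, inside the transaction
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s",
                        (job_id,))
        return True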
Your first point resonates. I had an idea I wanted to explore and decided to use Make.com and google sheets. After two hours I said, screw this, and spun up my entire idea in a Rails app in 30 minutes.
Knowing a good Swiss army tool very well is a super power.
Yes, my default strategies are quite similar. I would also add: pre-populate data where you can rather than relying on robust data entry by users.
I feel like Django has the largest RoI of any framework out there
Completely OT and apologies if rude, but you're Gary Numan the musician?
I love finding out when celebrities have talents elsewhere. And Wikipedia says you've had quite a bit of aviation experience as well.
Kinda makes my morning... lol
p.s. The inner city cynical kid in me is now required to throw in that I found Django disappointing even though everyone else seems to love it. Ok... identity restored... can now go back to shaking my fist at the sky as per usual...
I think Rails is stiff competition, it's just I prefer Python.
Laravel/PHP world is also not bad. It gets bad rep but for CRUD apps the PHP model of every request being isolated is pretty great.
I last touched PHP when you sprinkled it in between HTML (and deployment was FTP-ing it to a server) so I'm well out of date with that though I've heard people say that it's quite nice nowadays.
In Django, how do you create components well and handle user interactions?
It's been a long time (we've gone all React), but I remember doing reusable components with templatetags.
I think that's where htmx comes in
re: components, just use Jinja templates.
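e.g. the templatetags approach is roughly this (names hypothetical):

    # app/templatetags/components.py
    from django import template

    register = template.Library()

    @register.inclusion_tag("components/user_card.html")
    def user_card(user):
        # After {% load components %}, any template can render the
        # component with {% user_card user %}.
        return {"user": user}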
This is great, but it's for a very narrow set of programming problems.
You clearly work on web applications of moderate scale. For example, Kubernetes and Redis can suddenly become almost necessities for a back end service once it reaches a certain scale.
I think scale matters quite a lot here.
If you're building something yourself or in a small team, I absolutely agree with everything written in the post. In fact, I'd emphasize you should lean into this sort of quick and dirty development methodology in such a context, because this is the strength of small scale development. Done correctly it will have you running circles around larger operations. Bugs are almost always easy to fix later for a small team or solo dev operation, as you can expect everyone involved to have a nearly perfect mental model of the entire project, and the code itself will, regardless of the messes you make, tend to stay relatively simple due to Conway's law.
In larger development projects, fixing bugs and especially architectural mistakes is exponentially more expensive as code understanding is piecemeal, the architecture is inevitably nightmarishly complex (Conway again), and large scale refactoring means locking down parts of the code base so that dozens to hundreds of people can't do anything (which means it basically never happens). In such a setting the overarching focus is and should be on correctness at all steps. The economies of scale will still move things forward at an acceptable pace, even if individual developers aren't particularly productive working in such a fashion.
Hmm, context matters a lot. I'm not sure what you consider large development projects, so maybe it's even bigger than what I'm thinking of. But getting the APIs between apps up and ready early, and getting a working setup from the database team through some kind of backend/API team to the frontend and app teams, has always proven to be the correct choice for me. Also, getting it onto a production server as fast as possible, so that only the DNS is missing from being in production, helps so much for testing and highlights bugs and other problems between teams. So the author mostly talks about this from a code perspective, but IMHO it's even more important on larger teams.
(Sidenote: having this kind of architecture where you create layer of deps from one team to another is a bad idea from my point of view, but is still done a lot)
It's a scale. On the one extreme you have solo development, on the other you have the gargantuan code bases at e.g. Google, or the Linux kernel.
What you're describing is somewhere in the middle (if you imagine a logarithmic scale), it's at a point where working like a solo dev begins to break down especially over time, but not at a point where it's immediately catastrophic.
Startups sometimes work in that sort of hybrid mode where they have relatively low quality code bordering on unmaintainability, where they put off fixing its problems into the future when they've made it big.
This is the point at which you scale down. Nobody needs these huge systems but everyone for some reason wants them...
In that case your system is a legitimate candidate for Micro Services.
Services that can each be maintained by a small team, with a clean, understandable API, appropriate protections for data (both security and consistency) and easily predictable cost and behavior.
Then you can compose these services to get more complex functionality.
I've found that a "rough draft" is pretty hard to maintain as a "draft," when you have a typical tech manager.
Instead, it becomes "final ship" code.
I tend to write ship code from the start, but do so, in a manner that allows a lot of flexibility. I've learned to write "ship everywhere," even my test harnesses tend to be fairly robust, ship-Quality apps.
A big part of that, is very high-Quality modules. There's always stuff that we know won't change, or, if so, a change is a fairly big deal, so we sequester those parts into standalone modules, and import them as dependencies.
Here's an example of one that I just finished revamping[0]. I use it in this app[1], in the settings popover. I also have this[2] as a baseline dependency that I import into almost everything.
It can make it really fast, to develop a new application, and can keep the Quality pretty high, even when that's not a principal objective.
[0] https://github.com/RiftValleySoftware/RVS_Checkbox
[1] https://github.com/RiftValleySoftware/ambiamara
[2] https://github.com/RiftValleySoftware/RVS_Generic_Swift_Tool...
Tangent, is it a Swift thing to have those `/* ########## */` comment markers?
It quickly becomes very visually dominant in the source code:

    /* ###################################################################### */
    // MARK: - PUBLIC BASE CLASS OVERRIDES -
    /* ###################################################################### */
i see lots of python with blocks like that, built out of `#` divider comments. (not in this case but) it helps me for long files where modularization would be inconvenient
Nope. It's a "Me" thing. I write code that I want to see. I have fairly big files, and it makes it easy to scroll through, quickly. It also displays well, when compiling with docc or Jazzy.
My comment/blank line-to-code ratio is about 50/50. Most of my comments are method/function/property headerdoc/docc labels.
Here's the cloc on the middle project:
> For example, if you’re making a game for a 24-hour game jam, you probably don’t want to prioritize clean code. That would be a waste of time! Who really cares if your code is elegant and bug-free?
Hate to be an anecdote Andy here, but as someone who has done a lot of code review at (non-game) hackathons in the past (primarily to prevent cheating), the teams that performed the best were also usually the ones with the best code quality and often at least some rudimentary testing setup.
These two statements do not contradict each other. That the teams had the best code quality does not mean they must have prioritised clean code.
I mean, it's pretty strong evidence that they value good code. You don't get quality by accident. (But you do get it by habit!)
But, in another sense, they didn't "prioritize" good code because it isn't really a tradeoff. They're just better.
The gaming use case is what makes this apt advice. If you've got 24h to make a game and you're spending more than ~1h worrying about the source code cleanliness, I don't think it's gonna go well.
Systems like UE blueprints showcase how pointless the pursuit of clean anything is when contrasted with the resulting product experiences.
If you're an experienced developer, writing clean code doesn't add any more time than writing shitty code. It's just ingrained habits of what makes for the best productivity.
I think some people look at code cleanliness in aggregate (this code base is dirty vs. it is elegant and bug-free), and others have a fine-grained cost/benefit analysis of nonfunctional code improvements.
I'm pretty sure the latter vastly outperform the former under all circumstances, whether quick-and-dirty hackathon or ultra-hardened production code.
> It can reveal “unknown unknowns”. Often, prototypes uncover things I couldn’t have anticipated.
This is the exact opposite of my experience. Every time I am playing around with something, I feel like I'm experiencing all of its good and none of its bad ... a honeymoon phase if you will.
It's not until I need to cover edge cases and prevent all invalid state and display helpful error messages to the user, and eliminate any potential side effects that I discover the "unknown unknowns".
I think you're talking about unknown unknowns in the tool/framework/library. I think the author is talking about unknown unknowns in the problem space.
I was talking about both. Sometimes even in a problem space time constraints demand that you utilize something off the shelf (whether you use part of it or build on top of a custom version of it).
Tools aside, I think everyone who has 10+ years can think of a time they had a prototype go well in a new problem space only to realize during the real implementation that there were still multiple unknown unknowns.
Yeah, typically when you start thinking something through and actually implementing stuff, you notice that some important part of the behaviour is missing, and it might also be something that means the project is no longer feasible.
I think this applies to both tools/frameworks/libs and problem spaces
Yes. I wanted to warn about a rough draft being too rough. There are corners one shouldn't cut because this is where the actual problems are. I guess that rally pilots do their recon at a sustained pace, otherwise they might not realize that e.g. the bump there before the corner is vicious.
Yeah, it is something like how making tools just you yourself use is so smooth. Like, they can be full of holes and be a swaying house of cards in general, but you can still use them successfully.
> For example, if you’re making a game for a 24-hour game jam, you probably don’t want to prioritize clean code. That would be a waste of time! Who really cares if your code is elegant and bug-free?
Having worked on some 24-hour game jams and similar, I've found completely the opposite. It's when you're in a real hurry that you really can't afford bad code. Writing better code will make it easier to get it right, will put less pressure on my working memory, will let me try things faster and make those last-minute changes I wanted, will make adding features towards the end easier rather than harder and, crucially, will both reduce the chance that I need to do intense debugging and make debugging I need to do easier.
Working with good code just feels lighter.
The thing that breaks 24-hour projects isn't writing code too slowly, it's writing yourself into a corner or hitting some problem that derails your work, takes hours to solve or doesn't even get resolved until after the deadline.
A game jam isn't the place to try to squish all bugs, sure, but that's a question of what you're doing, not how. I still want to write good code in that situation because it makes my life easier within the time constraints, and because, even if I'm fine with some bugs, there are still lots of bugs that render a game unpleasant or unplayable.
I'll need to fix some bugs no matter what; I'd rather fix fewer, easier bugs than more, harder bugs!
The same thing applies to longer time horizons too. When you have more time you have more slack to deal with bad code, but that doesn't mean it makes any more sense to write it!
And, of course, once you get in the right habits, writing solid quality code becomes basically free... but, even if it really did meaningfully slow you down, chances are it would still be worth doing in expectation.
There is something to this, but I'm fairly convinced that the key to writing fast clean code is to write more code. Pretty much period.
Sucks, as that is effectively arguing for rote repetition in the tasks. And it is. But, that works. Really well.
Stated differently, show me someone that can write clean code in a hurry, and you have shown me someone that has written this before.
I second this. I've done lots of game jams and I think the "messy code" threshold for me is like, 1-2 hours away from the deadline at most, on files nobody else will touch. It depends on the type of cleanup, but factoring out common logic really doesn't take that long.
As the above comment says, in my experience bugs introduced from messy code are way more likely than the time savings of not cleaning up code.
The usual exception I'd make are things that like, mostly the same but not quite (e.g. a function to fade out a light over time vs a function to fade out a color over time). Often I find requirements for those diverge over time, so I'll just leave some repeated patterns across both.
Well, you probably would disregard some fancy asset loader and just statically include some files instead.
Or if you need to do some pathfinding, your quickest solution might be a breadth-first search.
Perhaps it isn't "bad code", but it's still a crude solution that can be implemented quickly and compensated for with a lot of computing power.
Of course you may use ready-to-use modules that provide such features as well..., but that may be prohibited by competition rules, I don't know...
Using a different algorithm is a change in what you're doing, not how you're doing it, so I'd see that as qualitatively different from writing bad code.
I agree.
I think it's a misconception that writing good code must take longer than writing bad code. At least if you want it to vaguely satisfy some requirements.
> What is my team’s idea of “good enough”? What bugs are acceptable, if any? Where can I do a less-than-perfect job if it means getting things done sooner?
I hop between projects regularly, and this has been the biggest source of inter-team conflict in my career.
Different people from different backgrounds have an assumed level of what "good enough" means. The person from big tech is frustrated because no one else is testing thoroughly. The person from a startup is frustrated because everyone else is moving too slow.
It would be nice if the "good enough" could be made explicit so teams are on the same page.
This is a team charter. A “how we work” document.
https://asana.com/resources/team-charter-template
Isn't the current layoff-heavy tech world the biggest threat to software quality and engineer productivity?
The perpetual looming threat of layoffs and the need to deliver wins ASAP stifle creativity, punish experimentation, and push people to burnout. It forces people into groupthink about topics like AI. Nobody can say the emperor has no clothes (the emperor being leadership or the topic du jour).
Forget LLM coding, solve this problem...
The biggest threat to software quality, has always been and will always be that consumers don't pay for quality.
Consumers that have good taste (or at least perceive differences in quality) are not numerous enough to support new products that differentiate themselves only with quality. And they (the consumers) are not successful enough in their own enterprises to pay extra for better quality products.
It's easier to find examples where people do pay for quality outside of software. Look at the spectrum in quality available for vehicles or household appliances.
Yup. As engineers, we MIGHT care about code quality. The end user just cares if something works in the way they want/expect it to. A lot of large successful companies have rough code quality.
The biggest threat is vendor-lockin at the programming level, far more destructive than SAAS lockin. We already have monopolies in hardware now we are about to have monopolies in software by the same companies who monopolized the hardware. Giving them so much power that there will no longer be computer programmers, there will only be LLM prompters.
I agree with you
An important dimension that is not really touched upon in the article is development speed over time. This will decrease with time, project size and team size. Minimising the reduction rate may require doing things that slow down immediate development for the sake of longer-term velocity. Some examples would be tests, documentation, decision logs, Agile ceremonies etc.
Some omissions during initial development may have a very long tail of negative impact - obvious examples are not wiring in observability into the code from the outset, or not structuring code with easy testing being an explicit goal.
Even as a solo developer, I can swear by decision logs, test and documentation, in that order. I personally keep a "lab notebook" instead of a "decision log" which chronicles the design in real-time, which forms basis of the tests and documentation.
Presence of a lab notebook allows me to write better documentation faster, even if I start late, and tests allow me to verify that the design doesn't drift over time.
Starting blindly may be acceptable for a one-off tool written in a weekend, but for anything going to live longer, building the slow foundation allows things built on it to be sound, rational (for the problem at hand) and, more importantly, understandable/maintainable.
Also, as an unpopular opinion, design on paper first, digitize later.
Right, and an important part of this is keeping in mind other future developers working on your codebase. You six months later are that other developer, once the immediate context is gone from your head. :)
That's very true. I like to word this a little differently:
> Six months ago, only I and God knew how this code worked. Now, only God knows. :)
This is very familiar. Rough draft, some manual execution often wrapped in a unit test executor, or even written in a different scripting language just to verify the idea. This often helped me to show that we don't even want to build the thing, because it won't work the way people want it to.
The part about distraction in code feels also very real. I am really prone to "clean up things", then realize I'm getting into a rabbit hole and my change grows to a size that my mates won't be happy reviewing. These endeavors often end with complete discard to get back on track and keep the main thing small and focused - frequent small local commits help a lot here. Sometimes I manage to salvage something and publish in a different PR when time allows it.
Business mostly wants the result fast and does not understand tradeoffs in code until the debt hits the size of a mountain that makes even trivial changes painfully slow. But it's about balance, which might be different on different projects.
Small, focused, simple changes definitely help. Although, people are not always good at slicing a larger solution into smaller chunks. I sometimes see commits that ship completely unused code unrelated to anything with a comment that this will be part of some future work...then prio shifts, people come and go, and a year later we have to throw out all of that, because it does not apply to the current state and no one knows anymore what was the plan with that.
> Data modeling is usually important to get right, even if it takes a little longer. Making invalid states unrepresentable can prevent whole classes of bugs. Getting a database schema wrong can cause all sorts of headaches later
So much this.
Get the data model right before you go live, and everything is so simple, get it wrong and be prepared for constant pain balancing real data, migrations, uptime and new features. Ask me how I know
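A tiny illustration of making invalid states unrepresentable (hypothetical order domain; the same idea applies at the schema level with constraints):

    from dataclasses import dataclass
    from typing import Union

    # Rather than one Order with nullable shipped_at/cancelled_reason
    # fields (which lets "shipped AND cancelled" exist), give each
    # state its own shape; the impossible combination has no encoding.
    @dataclass
    class Pending:
        pass

    @dataclass
    class Shipped:
        shipped_at: str

    @dataclass
    class Cancelled:
        reason: str

    OrderState = Union[Pending, Shipped, Cancelled]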
I came of age in SW dev when we started with the (database) schema. This doesn't seem to be common any more, and I regularly see experienced devs with low to no SQL exposure. They typically seem to work at an abstraction (or 2 or 3) above the API or maybe the ORM, but would struggle to write the resultant query, let alone profile it.
I'm not convinced this was a good abstraction that really helps us be more effective.
APIs, data models and architecture are the main things you can't Agile your way out of. You need to get them right up front before you start iterating on the implementation.
We encounter many rough drafts (yours) in production systems. If the original devs are still there, it is usually something along the lines of: I showed the rough draft to my manager, they flagged it as done, and I was assigned to another task.
I see this an awful lot. Most recently it was a presentation of hackathon projects, with product managers & executives asking how much work was left to turn them into production features. It's pretty obvious how their brains are spinning.
This will get worse with AI
We already see it; the combination of job-hopping & AI are a perfect storm really.
From launch to failure is definitely getting fast-tracked; a few months ago we had yet another hospital system that just lost data; reading the code (no tests, no peer reviews; they don't use versioning) shows clear signs of LLMs: many files that do almost the same thing, many similar function names that do almost or actually the same thing, strange solutions to trivial problems, etc. The problem was a script run at startup which added an admin user, but if an admin user already existed, it truncated the table. No idea why, but it wasn't discovered earlier because after testing by the devs it was put live, the devs left (contractors), and it just ran without issues until the EC2 instance needed maintenance by AWS and was as such rebooted, after which all users were gone. Good stuff. They paid around 150k for it; that is not a lot in our field, but then to get this level of garbage is rather frightening. This was not life-threatening, but I know if it were, it would not be better, as you cannot make this crap up.
And let us be very clear: this happened REGULARLY pre-AI, with long-lived systems cycling in new contractors every 12 months, 6 months or less, over 20 or more years. Now we have AI that allows even faster iterations of this with even less conceptual integrity.
The big problem: the decision makers (c-suite executives) never really understood what was happening before, so you can't expect them to see the root cause of the problem we're actively creating. This means it will not get the attention and resourcing needed - plus they'll be gone and on to the next one after taking a huge payday for slashing their R&D costs.
Agreed.
I wonder if the rough draft approach is a good prompt for an agent. Since it can draft more quickly, you can review more quickly & get it on the right track.
A lot of established dev practices - like this one - are effective with AI generation. Another is the super-valuable but less common product spec that spends a lot of effort and verbiage defining what is NOT included. LLMs are helped greatly with explicit guardrails and restrictions.
I fear this is different from the "code slop jello poured over a bespoke marshmallow salad of a system" problem though. Mostly for the same reasons that Brooks described that make SW inherently hard 60+ years ago. It feels like the JS framework / SPA experience but with every.single. developer. and the 10x "improvement" is just speed.
Work with bad companies, be surprised by poor managers? Who is the "we" in this context, I assume an agency?
So that's not a problem with this process itself. You're describing problems with managers, and problems with developers being unable to handle bad managers.
Even putting aside the manager's incompetence, as a developer you can mitigate this easily in many different ways. Here are a few:
- Just don't show it to management
- Deliberately make it obviously broken at certain steps
- Take screenshots of it working and tell people "this is a mockup, I still have to do the hard work of wiring it up"
It's all a balancing act between needing to get feedback from stakeholders and managing expectations. If your management is bad, you need to put extra work into managing expectations.
It's like the famous duck story from Jeff Atwood (see jargon number 4); sometimes you have to manage your managers:
https://blog.codinghorror.com/new-programming-jargon/
Sure, but we actually thrive here; my company gets called in when systems are not functioning or badly broken and the owners cannot fix them themselves (usually because the people who built them left decades ago and the system has been kept running with duct tape ever since). We never stay for long; we just patch the system and deliver a report. But to figure out what went wrong and write that report, we have to find out how it got that way, and it's always the same: they suck. Talking banks, hospitals, factories, it really doesn't matter; what gets written is all garbage, and 'TODO: will refactor later' is all over the place. We see many companies from the inside, and let me tell you: HN is a lovely echo chamber that resembles nothing in the real world.
I think the main lesson here is that most entities shouldn't be writing serious software themselves, but purchase software from reputable software companies whenever possible. At least who to hold responsible or sue is clearer in that case.
I actually try to build it "well" in the first pass, even for prototyping. I'm not gonna say I succeed but at least I try.
This doesn't mean writing tests for everything, and sometimes it means not writing tests at all, but it means that I do my best to make code "testable". It shouldn't take more time to do this, though: if you're making more classes to make it testable, you're already messing it up.
This also doesn't mean compromising on readability, but it does mean eschewing practices like "Clean Code". Functions end up being as large as they need to be. I find that a lot of people, especially doing Ruby and Java, tend to spend too much time here. IMO having lots of 5-line functions is totally unnecessary, so I just skip this step altogether.
It also doesn't mean compromising on abstractions. I don't even like the "rule of three" because it forces more work down the line. But since I prefer DEEP classes and SMALL interfaces, in the style of John Ousterhout, the code doesn't really take longer to write. It does require some thinking but it's nothing out of the ordinary at all. It's just things that people don't do out of inertia.
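(As a toy sketch of that Ousterhout style, with invented names: one deep class with a two-method surface, instead of several shallow helpers. A rough illustration, not a prescription.)

    import json
    import os
    import tempfile

    class DocumentStore:
        # Small interface: save() and load(). The messy parts (atomic
        # writes, serialization, file paths) stay hidden inside the class.

        def __init__(self, directory: str):
            self._dir = directory
            os.makedirs(directory, exist_ok=True)

        def save(self, doc_id: str, data: dict) -> None:
            # Write to a temp file, then rename, so a crash mid-write
            # never leaves a half-written document behind.
            fd, tmp = tempfile.mkstemp(dir=self._dir)
            with os.fdopen(fd, "w") as f:
                json.dump(data, f)
            os.replace(tmp, os.path.join(self._dir, doc_id + ".json"))

        def load(self, doc_id: str) -> dict:
            with open(os.path.join(self._dir, doc_id + ".json")) as f:
                return json.load(f)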
One thing I am a bit of a hardliner about is scope. If the scope is too large, it's probably not prototype or MVP material, and I will fight to reduce it.
EDIT: kukkeliskuu said below "learn one tool well". This is also key. Don't go "against the grain" when writing prototypes or first passes. If you're fighting the framework, you're on the wrong path IME.
I personally find that doing it well in the first pass slows me down and also ends up in worse overall designs.
But I am also pretty disciplined on the 2nd pass in correcting all of the hacks and rewriting everything that should be rewritten.
There are two problems I have with trying to do it right the first time:
- It's hard to know the intricacies of the requirements upfront without actually implementing the thing, which results in designing an architecture with imperfect knowledge
- It's easy to get stuck in analysis paralysis
FWIW I am a huge fan of John Ousterhout. A Philosophy of Software Design may be my all-time favorite book on software design.
I have found that too much coupling between product requirements and the architecture can be detrimental. It's often the reason people tend to do too much upfront work, and it also slows down the evolution of the feature.
So I don't really want to know the future requirements, or refactor on the 2nd pass to "match".
If some feature needs too many modifications or special cases in the current architecture, it's a square peg in a round hole. I prefer to have those places be a bit more "painful" in the code. The code doesn't have to be bad per se, but it should be clear that something different and non-traditional is happening there.
This pretty much exactly describes my strategy to ship better code faster. Especially the “top down” approach: I'm actually kind of surprised there isn't a “UI first” or “UI Driven Development” manifesto like with TDD or BDD. Putting a non-functional UI in front of stakeholders quickly often results in better requirements gathering and early refinement that would be more costly later in the cycle.
Why not build a simple functional UI? It's not a huge time sink to go from non-functional to functional (as long as it's kept simple).
Well, sometimes I will, but take for example a simple list+form on top of a database. Instead of building the UI and the database and then showing the stakeholder, who adds/renames fields, changes relationships, etc., I will intentionally build just the UI, not wired up to a database. Sometimes it's just an in-memory store or nothing. Then, _after_ the stakeholder is somewhat happy with the UI, I "bake" in things like a service or data layer. This way the changes the stakeholder inevitably has up front have less of an impact.
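(A minimal sketch of that seam, with hypothetical names: the UI codes against a tiny store interface, and the real data layer gets "baked" in later without touching the UI.)

    # Hypothetical in-memory stand-in used while the stakeholder iterates on the UI.
    class InMemoryNoteStore:
        def __init__(self):
            self._rows = {}
            self._next_id = 1

        def list(self):
            return list(self._rows.values())

        def save(self, fields: dict) -> int:
            row_id = self._next_id
            self._next_id += 1
            self._rows[row_id] = {"id": row_id, **fields}
            return row_id

    # The UI only ever calls list()/save(). Once fields stop being renamed,
    # swap in a class with the same two methods backed by a real table.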
Well, most of the time the people I worked with preferred having something they could see and comment on earlier, even if just by a few days. Maybe that is why it works for him too.
I call it "outside in", but sometimes like to de-risk a lower level component before investing in the UI.
This post resonates deeply with how I build products, especially in the era of LLMs and AI-assisted coding.
I usually start top-down, sketching the API surface or UI scaffold before diving into real logic. Iteration drives the process: get something running, feel out the edges, and refine.
I favor MVPs that work end-to-end, to validate flow and reduce risk early. That rhythm helps me ship production-ready software quickly, especially when navigating uncertainty.
One recent WIP: https://zero-to-creator.netlify.app/. I built it for my kid, but I’m evolving it into a full-blown product by tweaking the edges as I go.
One important aspect, also highlighted by others, is that for the long term you actually _don't_ want to focus solely on the immediate task you're solving. Sure, in the short term the tasks get done quicker, but since the end goal typically is a full, coherent solution, you _have_ to step back and take a look at the bigger picture every now and then. Typically you won't be allocated specific time to do this, so the "take a bird's eye view" part has to be incorporated into day-to-day work instead. It's also typically easier to notice bigger issues while you're already in the trenches than to do "cleanup" separately "later".
Just one thing I'd like to throw out here after far too long in the industry: It's very hard to tell ahead of time just how users will put computers/software to use. Lots of generic, off-the-shelf HW and SW get used in ways that are ultimately 'life critical' to someone.
Something to keep in mind for design/development/testing.
The initial rough draft almost reminds me of the old "Build One to Throw Away" approach, which I think is pretty nice - not getting caught up in making something production ready, but rather exploring the problem space first.
I do admit that modern frameworks also help in that regard, compared to just stitching libraries together, at least for typical webdev stuff rather than more minimalistic CLI utilities or tools. The likes of Ruby on Rails, Django, Laravel, Express, and even ASP.NET or Spring Boot.
Where I live we have this saying about building houses: Build one for your enemy, then build one for your friend, and then build your own one.
I like the writing style. It is simple and effective.
The author said LLMs help. Let's lynch him!
I love his writing too! I read this post a few days ago and really liked it, so I started going through his older posts. It's no coincidence that his writing is good—he's actively working to improve it: https://evanhahn.com/economist-style-guide-book-takeaways/.
Of all the comments the post received, this one meant the most to me. Thank you.
How I build software quickly: get rid of team members first, communication slows things down.
What the article calls "rough draft" I like to call "executable architecture", a nice term from the "rational unified process".
When possible, I try to use real data to test both volume and heterogeneity.
It helps reveal unknowns in the problem space that synthetic data might miss.
This is very important and requires some foresight when the real data is personally identifiable information, private health information, etc.
It's possible, but requires designing a safe way to run pre-production code that touches production data. Which in practice means you better be sure you're only doing reads, not writes, and running your code in the production environment with all the same controls as your production code.
You are right. I have a pre-production environment with a copy of production data and a script that scrambles names and personal info.
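(A minimal sketch of that kind of scrambler, assuming a simple users table; the column names are invented. Deterministic hashing keeps duplicates and joins behaving while removing the real values.)

    # Hypothetical PII scrambler for a pre-production copy, not the commenter's script.
    import hashlib
    import sqlite3

    def pseudonym(value: str, prefix: str) -> str:
        # Same input -> same fake value, so relationships survive scrambling.
        return prefix + "_" + hashlib.sha256(value.encode()).hexdigest()[:8]

    conn = sqlite3.connect("preprod_copy.db")
    rows = conn.execute("SELECT id, name, email FROM users").fetchall()
    for row_id, name, email in rows:
        conn.execute(
            "UPDATE users SET name = ?, email = ? WHERE id = ?",
            (pseudonym(name, "user"), pseudonym(email, "mail") + "@example.invalid", row_id),
        )
    conn.commit()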
I try to do UX design with real data too. Not sure if that is what you mean by heterogeneity?
Not quite UX-focused, but related
I meant data heterogeneity - the variety in formats, edge cases, and data quality you encounter in production. Real user data often has inconsistencies, missing fields, unexpected formats, etc. that synthetic test data tends to miss.
This helps surface integration issues and performance bottlenecks early.
>(I recently had a branch where an error message was logged 20 times per second.)
Lightwork
I often use C# and Visual Studio to write prototype code. C# can be written in a C-like syntax (C being my destination language) and has much better turnaround times.
Building software quickly seems to mostly come from having enough examples of code you've already built that you can pull from.
I recently re-made my web-based notes app. Before working on this project I made a web-based S3 file manager (i.e. CRUD operations in my own UI).
Instead of trying to store notes in a database or something, I just yoinked the S3 file manager code and stored my notes in S3. A few tweaks to the UI and a few more features, and now I have a notes app.
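(The S3-as-notes-store trick is only a handful of calls; a rough boto3 sketch, with the bucket name and key layout as placeholders.)

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-notes-bucket"  # hypothetical bucket

    def save_note(title: str, body: str) -> None:
        s3.put_object(Bucket=BUCKET, Key="notes/" + title + ".md", Body=body.encode())

    def load_note(title: str) -> str:
        obj = s3.get_object(Bucket=BUCKET, Key="notes/" + title + ".md")
        return obj["Body"].read().decode()

    def list_notes():
        resp = s3.list_objects_v2(Bucket=BUCKET, Prefix="notes/")
        return [item["Key"] for item in resp.get("Contents", [])]

    def delete_note(title: str) -> None:
        s3.delete_object(Bucket=BUCKET, Key="notes/" + title + ".md")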
Fast builds are important. I've been doing server-side stuff for a few decades, and there are some things you can do to turn slow builds into fast builds. I mostly work on the JVM, but a lot of this stuff ports well to other stacks (e.g. Ruby or Python).
Basically there are things you can't avoid that are not necessarily fast (e.g. compilation, docker build, etc.) and things that you can actually control and optimize. Tests and integration tests are part of that. Learning how to write good effective tests that are quick to run is important. Because you might end up with hundreds of those and you'll be spending a lot of your career waiting for those to run. Over and over again.
Here's what I do:
- I run integration tests concurrently. My CPUs max out when I run my tests. My current build runs around 400 integration tests in about 35 seconds. Integration test means the tests are proper black box tests that hit a REST API with my server talking to a DB, Elasticsearch and Redis. Each test might require users/teams and some content set up. We're talking many thousands of API calls happening in about 35 seconds.
- There is no database cleanup in between tests. Database cleanup is slow. Each build starts with an ephemeral docker container. So it starts empty but by the time the build is over you have a pretty full database.
- To avoid test interaction, all data is randomized. I use a library that generates human-readable names, email addresses, etc. Creating new users/teams is fast; recreating the database schema isn't. And because at any time there can be 10 separate tests running, you don't want that anyway. Some tests share the same read-only test fixture and team. Recreating the same database content over and over again is stupid.
- A proper integration test is a scenario that is representative of what happens in your real system. It's not a unit test. So the more side effects, the better. Your goal is to find anything that might break when you put things together. Finding weird feature interactions, performance bottlenecks, and sources of flakiness is a goal here and not something you are trying to avoid. Real users don't use an empty system. And they won't have it exclusive to themselves either. So having dozens of tests running at the same time adds realism.
- Unit tests and integration tests have different goals. With integration tests you want to cover features, not code. Use unit tests for code coverage. The more features an integration test touches, the better. There is a combinatorial explosion of different combinations of inputs; it's mathematically impossible to test all of them with an integration test. So, instead of having more integration tests, write better scenarios for your tests. Add to them. Refine them with detail. Asserting stuff is cheap. Setting things up isn't. Make the most of what you set up.
- IMHO anything in between scenario tests and unit tests is a waste of time. I hate white-box tests, because they are expensive to write and run and yet not as valuable as a good black-box integration test. Sometimes you have to. But these are low-value, high-maintenance, expensive-to-run tests. A proper unit test is high value, low maintenance, and very fast to run (it mocks/stubs everything it needs, so there is no setup cost). A proper integration test is high value, low maintenance, and slow to run. You justify the time investment with value. Low maintenance here means not a lot of code is needed to set things up.
- Your integration test becomes a load and stress test as well. Many teams don't bother with this. I run mine 20 times a day, because it takes less than a minute. Anything that increases that build time gets identified and dealt with. My tests passing gives me a high degree of certainty that nothing important has broken.
- Most of the work in creating a good test is setting up the given part of a BDD-style test. Making that easy with some helper functions is key. Most of my tests require users, teams, etc. and some objects. So I have a function "createTeam" with some parameters that calls all the APIs to get that done. This gets called hundreds of times in a build. It's a nice one-liner that sets it up. Most of my tests read like this: create a team or teams, do some stuff, assert, do more stuff, assert, etc. (A sketch of this style follows after the list.)
- Poll instead of sleeping. A lot of stuff happens asynchronously, so there is a lot of test code that waits for shit to happen. I use kotest-assertions, which has a nice "eventually" helper that takes a block and runs it until it stops throwing exceptions (or times out). It retries at a configurable interval that backs off with increasing sleep periods. Most things just take a second or two to happen. (A Python analogue is sketched below.)
- If your CPUs are not maxed out during the test, you need to be running more tests, not less. Server tests tend to be IO blocked, not CPU blocked. And your SSD is unlikely to be the bottleneck. We're talking network IO here. And it's all running on localhost. So, if your CPUs are idling, you can run more tests and can use more threads, co-routines, whatever.
- Get a decent laptop and pay for fast CI hardware. It's not worth waiting 10 minutes for something that could build in about a minute. That speedup is worth a lot. And it's less likely to break your flow state.
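(The commenter's stack is the JVM, but the randomized-fixture idea ports anywhere; a hedged Python analogue, with the endpoint paths and helper names invented.)

    # Hypothetical sketch of the randomized, no-cleanup fixture style.
    import uuid
    import requests

    BASE = "http://localhost:8080"  # assumed local test server

    def create_team(members: int = 2) -> dict:
        # One-liner setup helper: unique random names mean concurrent
        # tests never collide, so no cleanup is needed between tests.
        suffix = uuid.uuid4().hex[:8]
        team = requests.post(BASE + "/api/teams", json={"name": "team-" + suffix}).json()
        for i in range(members):
            requests.post(
                BASE + "/api/teams/" + str(team["id"]) + "/users",
                json={"email": "user-" + suffix + "-" + str(i) + "@example.invalid"},
            )
        return team

    def test_team_listing():
        team = create_team()
        listing = requests.get(BASE + "/api/teams").json()
        assert any(t["id"] == team["id"] for t in listing)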
This stuff is a lot easier if you engineer and plan for it. Introducing concurrently running tests to a test suite that isn't ready for it can be hard. Engineering your tests to be able to support running concurrently results in better tests. So if you do this properly, you get better tests that run faster. Win win. I've been doing this for a while. I'm very picky about what is and isn't a good test. There are a lot of bad tests out there.
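(And the poll-don't-sleep bullet from the list above: kotest's eventually is Kotlin, but the pattern is a few lines in any language. A rough Python sketch.)

    import time

    def eventually(block, timeout: float = 10.0, initial_interval: float = 0.05):
        # Run `block` until it stops raising; re-raise once `timeout` expires.
        deadline = time.monotonic() + timeout
        interval = initial_interval
        while True:
            try:
                return block()
            except Exception:
                if time.monotonic() >= deadline:
                    raise
                time.sleep(interval)
                interval = min(interval * 2, 1.0)  # back off, capped at 1s

    # Usage: eventually(lambda: assert_search_index_contains(doc_id))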
I find people overfocus on fast-running tests, often to the exclusion of tests that test realistically and couple loosely to the code.
This is a pretty easy and natural thing to do because it's quite easy to go "I shaved 2.5 minutes off my build" whereas "I increased the maintainability and realism of our tests, adding 3 minutes to the build" is a much more nebulous and hard thing to justify even when it does save you time in the long run.
As Drucker says, "what gets measured gets managed" <- quantifiable metrics get more attention even when they're less important.
>A proper unit test is high value, low maintenance, and very fast to run (it mocks/stubs everything it needs, so there is no setup cost).
^^ This is a case in point: mocks and stubs do make for fast-running test code, but they commensurately decrease the realism of the test and increase maintenance overhead. Even in unit tests I've shifted to writing almost zero mocks and stubs and using only fakes.
I've had good luck writing what I call "end to end unit tests" where the I/O boundary is faked while everything underneath it is tested as is, but even this model falls over when the I/O boundary you're faking is large and complex.
In database heavy applications, for instance, so much of the logic will be in this layer that a unit test will demand massive amounts of mocks/stubs and commensurate maintenance and still tell you almost nothing about what broke or what works.
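(A small illustration of the fake-over-mock point, all names hypothetical: the fake is a real, if simplified, implementation of the boundary, so the logic under test runs for real instead of against a scripted mock.)

    class FakeUserRepo:
        # Working in-memory implementation of the I/O boundary,
        # unlike a mock that only records and replays calls.
        def __init__(self):
            self._users = {}

        def get(self, user_id: str):
            return self._users.get(user_id)

        def add(self, user_id: str, email: str) -> None:
            if user_id in self._users:
                raise ValueError("duplicate user")  # real invariant, enforced for real
            self._users[user_id] = email

    def register_user(repo, user_id: str, email: str) -> bool:
        # Code under test runs unchanged against the fake or the real repo.
        if repo.get(user_id) is not None:
            return False
        repo.add(user_id, email)
        return True

    def test_register_twice():
        repo = FakeUserRepo()
        assert register_user(repo, "u1", "a@example.invalid")
        assert not register_user(repo, "u1", "a@example.invalid")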
Fast and comprehensive are not mutually exclusive goals. Having fast tests makes it more likely you'll add more and better tests as well. Because the cost of doing that gets lower. I have some pretty comprehensive tests that setup fairly complicated scenarios. The cost for that is low.
A slow test will be something you avoid running when you should be running it. When it takes 20 minutes to validate a change, it gets tempting to skip it or postpone doing it. Or you'll do it and get sidetracked by something else. The ability to run high-quality integration tests quickly is a superpower. We're used to these things running slowly, but my point is that you can engineer things such that they run fast, and that's worth doing.
IMHO a key mistake is treating integration tests as unit tests, which then dictates that you do silly things like running expensive cleanup logic, isolating tests from each other, and giving them their own virgin system to run against. That actually makes your tests less valuable and more tedious to run.
The real world is messy. A good integration test benefits from the noise created by lots of other tests running. It's the closest you can get to a real running live system without using the actual live running system and testing in production. Real users will never see a virgin system and they won't be alone in the system. It's OK for there to be data in the database. You can isolate through other means: give tests their own users. Randomize key things so they don't clash with other tests, etc. This results in better tests that actually run faster.
I love my unit tests as well. But I don't unit test things that I cover with an integration test anyway. I reserve those for things that are proper units that I can easily test in isolation. Anything with complicated logic, regular expressions, or algorithms basically. Testing that with an integration tests is counter productive because your goal is to test the logic and you probably want to do that with lots of different inputs. And then you mock/fake anything that just gets in the way of testing that.
But unit testing APIs is silly if you are in any case writing proper full blown integration / scenario tests that use those APIs. I don't need to unit test my database layer with an in memory database either. If it's at all important, that functionality will be used as part of my integration tests triggering logic that needs a database. And it will run on a proper database. And I can use my API from the outside to evaluate the output and assert everything is as it should be without poking around in that database. This adds more API calls and realism to the test and ensures I don't need to make assumptions about the implementation. Which then means I can change the implementation and validate that it didn't break anything important.
This is why I have grown to appreciate gradual typing, at least for solo projects. In Python-land I can just riff over a few functions/scripts until I get a rough idea of the APIs/workflows I want, then bring mypy into the mix and shape things into their "final" form (this takes maybe a few hours). Rinse and repeat for each new feature, but at every iteration you build up from a "nicely-typed" foundation.
Sometimes a redesign of the types you relied on becomes necessary to accommodate new stuff, but that would be true under any language; otoh, the "exploratory" part of coding feels faster and easier.
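(A tiny before/after of that flow, with invented names: riff untyped, then pin the shape down once the API settles and let mypy hold the line.)

    # Pass 1: untyped riff while the data shape is still in flux.
    # def summarize(events):
    #     return {e["kind"]: e["count"] for e in events}

    # Pass 2: the API has settled; freeze it with types and run mypy.
    from dataclasses import dataclass

    @dataclass
    class Event:
        kind: str
        count: int

    def summarize(events: list[Event]) -> dict[str, int]:
        return {e.kind: e.count for e in events}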
"Learn to walk before you run. I firmly believe in this wisdom. First build a foundational model, then tackle boundary challenges and efficiency optimization."