Regarding Allocations in Web Apps

Typically, these are the allocations that happen in a web app:

A thread, coroutine, goroutine, or similar stack-like structure for each HTTP request
Network buffers
The web framework's per-request context
Business objects decoded from the network payload
DB queries built from business objects
Waiting for and receiving DB results, then deserializing them into business objects
HTTP responses built from business objects
Intermediate state while (de)serializing network/DB buffers to business or framework objects
A naive, allocation-heavy logging framework

Basically, most web apps spend too much time allocating. CPU time spent on allocation is time not spent on business logic. In managed languages, more allocations also mean more work for the GC when it kicks in, causing further stalls.
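To see how much a piece of code allocates, a rough measurement is usually enough. The sketch below assumes Go and uses the standard library's testing.AllocsPerRun. The two concat functions and the sample input are hypothetical, chosen only to contrast an allocation-heavy version with a leaner one.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the results alive so the compiler cannot discard the work.
var sink string

// concatNaive builds the string with +=, which allocates a new string
// on most iterations.
func concatNaive(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatBuilder sizes a strings.Builder up front so the backing array
// is allocated once.
func concatBuilder(parts []string) string {
	n := 0
	for _, p := range parts {
		n += len(p)
	}
	var b strings.Builder
	b.Grow(n)
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := []string{"GET ", "/users/", "42", " HTTP/1.1"}

	// AllocsPerRun reports the average number of heap allocations per call.
	naive := testing.AllocsPerRun(1000, func() { sink = concatNaive(parts) })
	built := testing.AllocsPerRun(1000, func() { sink = concatBuilder(parts) })

	fmt.Printf("naive: %.0f allocs/op, builder: %.0f allocs/op\n", naive, built)
}
```

The exact numbers depend on the Go version and compiler, so treat them as a comparison between the two versions rather than as fixed values.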

To reduce them:

Reuse request/response buffers and objects (pooling). They mostly have the same shape and a relatively similar size, because most requests are of the same type. Libraries like fasthttp can help. For typical static responses (an error object, an OK without context-specific data, etc.), initialize once and reuse.
Use a library that reads fields straight from the buffer instead of deserializing into objects (like jsonparser/fastjson, and soon molecule). Or at least use one that does not allocate heavily (like easyjson or gogo protobuf).
Start with storage clients (DB, queue, cache, etc.) that allocate less. If possible, also use a native zero-allocation API (e.g. flatbuffers/capnproto/custom).
Use a low-allocation logging library (like zerolog or zap).
Reuse intermediary business objects, or, if possible, make sure they are allocated on the stack.

Note that even though reducing allocations is good, we still need to balance it against code clarity.