A related and interesting point: as soon as I had removed the espconn interface completely and had the lwip interface working, my throughput for both UDP and TCP went up from ~260 kbytes/s to ~750 kbytes/s, almost three times as much. I also have a lot more DRAM to spare. I think that says "something" (again) about Espressif's coding quality.
I must admit I cheat a bit: my application code "fragments" oversized data (larger than one UDP/TCP payload) itself. The application fills the output buffer (4k) completely and hands it to lwip as "readonly/no-copy" data, one payload-sized chunk at a time. Every time a chunk has been sent (UDP) or acknowledged (TCP), the next chunk from the same buffer is sent. So lwip doesn't have much left to do, just ARP and TCP ACK handling.
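For anyone who wants to try the same trick, here is a minimal sketch of the chunking logic only, not my actual code. All names in it (`chunker_t`, `chunker_start`, `chunker_send_next`, `counting_send`) are made up for illustration, and the `send` callback is a stand-in for the real stack call. With lwip's raw API I would expect that to be something like `tcp_write()` without `TCP_WRITE_FLAG_COPY`, with the next chunk triggered from the callback registered via `tcp_sent()`, but treat that mapping as an assumption to check against your lwip version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for the real stack call.
 * Must return 0 on success, non-zero on failure. */
typedef int (*chunk_send_fn)(const uint8_t *data, size_t len, void *ctx);

typedef struct {
    const uint8_t *buf;   /* application-owned output buffer, never copied */
    size_t len;           /* number of valid bytes in buf */
    size_t offset;        /* start of the next chunk to hand over */
    size_t chunk_size;    /* maximum payload per packet/segment */
    chunk_send_fn send;   /* stand-in for the stack's send function */
    void *ctx;            /* passed through to send unchanged */
} chunker_t;

/* Prepare a filled buffer for chunked sending. */
void chunker_start(chunker_t *c, const uint8_t *buf, size_t len,
                   size_t chunk_size, chunk_send_fn send, void *ctx)
{
    assert(chunk_size > 0);
    c->buf = buf;
    c->len = len;
    c->offset = 0;
    c->chunk_size = chunk_size;
    c->send = send;
    c->ctx = ctx;
}

/* Hand the next chunk to the stack.
 * Returns 1 if a chunk was handed over, 0 if the buffer is drained,
 * -1 if send failed (offset is left unchanged so the chunk can be retried). */
int chunker_send_next(chunker_t *c)
{
    if (c->offset >= c->len)
        return 0;

    size_t n = c->len - c->offset;
    if (n > c->chunk_size)
        n = c->chunk_size;

    /* Pointer into the original buffer: no copy is made here. */
    if (c->send(c->buf + c->offset, n, c->ctx) != 0)
        return -1;

    c->offset += n;
    return 1;
}

/* Demo sender that only counts what it is given (illustration only). */
typedef struct {
    size_t chunks;
    size_t bytes;
} send_stats_t;

int counting_send(const uint8_t *data, size_t len, void *ctx)
{
    (void)data;
    send_stats_t *s = (send_stats_t *)ctx;
    s->chunks++;
    s->bytes += len;
    return 0;
}
```

The idea is to call `chunker_send_next()` once after filling the buffer and then once more from every "sent" or "acknowledged" event until it returns 0. Because the chunks point into the application buffer, that buffer must not be refilled until the last chunk is done.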
Did you all know that UDP packets are always sent directly, but TCP packets are only sent in the background (never from the code that calls tcp_send)? I only just found out. The espconn interface hides this fact.