We tried enabling LRU for some tracked functions in ty, but it caused a 25% performance regression when running ty multi-threaded. The root cause is that the LRU linked list is guarded by a single Mutex, which becomes a bottleneck for very hot queries because all threads end up blocking on it.
We should explore whether we can shard the LRU lock, similar to what we do for interned structs. This should dramatically reduce lock contention.
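
To make the idea concrete, here is a rough sketch of what a sharded LRU could look like. It is not based on the current implementation: `Shard`, `ShardedLru`, `record_use`, and the tick-based bookkeeping (instead of a linked list) are all hypothetical names and choices made for illustration. Each key is hashed to one shard, and only that shard's Mutex is taken when the key is used.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// One independently locked LRU shard (hypothetical, not the existing type).
struct Shard<K> {
    tick: u64,                  // logical clock, bumped on every use
    last_used: HashMap<K, u64>, // key -> tick of its most recent use
    by_age: BTreeMap<u64, K>,   // tick -> key, oldest entry first
    capacity: usize,
}

impl<K: Hash + Eq + Clone> Shard<K> {
    fn new(capacity: usize) -> Self {
        Shard { tick: 0, last_used: HashMap::new(), by_age: BTreeMap::new(), capacity }
    }

    /// Marks `key` as most recently used and returns the evicted key, if any.
    fn record_use(&mut self, key: K) -> Option<K> {
        self.tick += 1;
        // If the key was already tracked, drop its old position in the age index.
        if let Some(old_tick) = self.last_used.insert(key.clone(), self.tick) {
            self.by_age.remove(&old_tick);
        }
        self.by_age.insert(self.tick, key);
        if self.last_used.len() > self.capacity {
            // Evict the least recently used key of this shard only.
            let (_, victim) = self.by_age.pop_first()?;
            self.last_used.remove(&victim);
            return Some(victim);
        }
        None
    }
}

/// LRU split into shards so that threads touching different keys
/// usually lock different mutexes.
pub struct ShardedLru<K> {
    shards: Vec<Mutex<Shard<K>>>,
}

impl<K: Hash + Eq + Clone> ShardedLru<K> {
    pub fn new(shard_count: usize, capacity_per_shard: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count)
            .map(|_| Mutex::new(Shard::new(capacity_per_shard)))
            .collect();
        ShardedLru { shards }
    }

    /// Picks the shard for `key` by hashing it.
    fn shard_index(&self, key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.shards.len()
    }

    /// Records a use of `key`, locking only the shard that owns it.
    pub fn record_use(&self, key: K) -> Option<K> {
        let index = self.shard_index(&key);
        self.shards[index].lock().unwrap().record_use(key)
    }
}
```

One trade-off to keep in mind: eviction becomes least-recently-used per shard rather than globally, so the total capacity is only approximately respected and a hot shard can evict entries that are newer than entries sitting in a cold shard. That is probably acceptable for a cache, but it is a behavior change.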