[Python] Invalidation of rule GCI75 (String concatenation in loops): Massive memory overhead of .join() #160

@harag7810

Context:
Rule GCI75 (Don't concatenate strings in loops) is currently pending analysis for Python in the RULES.md file.
The rule's description recommends avoiding repeated memory allocations by using a StringBuilder object. Python has no native StringBuilder, so we simulated one with the standard idiomatic equivalent: accumulating text chunks in a list (a mutable array), then assembling them once at the end with "".join().
Our team (Les Buzz du decollage permanent) ran a benchmark to check whether this alternative provides a real ecological benefit (CPU and RAM) compared to the += operator.
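
For readers who want to see the two patterns side by side, here is a minimal sketch of what is being compared. The function names and the single-character default chunk are assumptions made for illustration; the issue does not state what text was appended in the actual benchmark.

```python
def concat_with_plus(n: int, chunk: str = "x") -> str:
    """Build a string by repeated += (the pattern GCI75 flags)."""
    result = ""
    for _ in range(n):
        result += chunk
    return result


def concat_with_join(n: int, chunk: str = "x") -> str:
    """Accumulate chunks in a list, then assemble once with "".join()."""
    parts = []
    for _ in range(n):
        parts.append(chunk)
    return "".join(parts)
```

Both functions return the same string; only the way intermediate data is held in memory differs.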

Test Protocol:
We scripted a stress test comparing += with our StringBuilder simulation (list plus "".join()) on loops of 10k, 100k, and 1 million iterations.
Speed (timeit): execution time, with the garbage collector disabled to isolate raw compute time.
Memory (tracemalloc): peak allocated RAM.
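
The protocol above could be scripted roughly as follows. This is a sketch under stated assumptions, not the team's actual script: `build_plus`, `build_join`, `measure`, `run`, the repeat count, and the one-character chunk are all hypothetical. Note that `timeit` turns off garbage collection during timing by default, which matches the "GC disabled" condition.

```python
import timeit
import tracemalloc


def build_plus(n):
    # Repeated += on a local string.
    s = ""
    for _ in range(n):
        s += "x"  # chunk content is an assumption
    return s


def build_join(n):
    # List accumulation followed by a single join.
    parts = []
    for _ in range(n):
        parts.append("x")  # chunk content is an assumption
    return "".join(parts)


def measure(func, n, repeat=3):
    """Return (best wall time in seconds, peak traced bytes) for func(n)."""
    # timeit disables the garbage collector while timing by default.
    seconds = min(timeit.repeat(lambda: func(n), number=1, repeat=repeat))

    # Measure peak memory in a separate run so tracing overhead
    # does not distort the timing.
    tracemalloc.start()
    func(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return seconds, peak


def run(sizes=(10_000, 100_000, 1_000_000)):
    """Print one line per (size, strategy) pair."""
    for n in sizes:
        for func in (build_plus, build_join):
            seconds, peak = measure(func, n)
            print(f"{func.__name__:<10} n={n:<9} {seconds:.6f}s  peak={peak} B")
```

Calling `run()` prints the timing and peak memory for each loop size. Results will likely depend on the Python version and on the size of the appended chunks, so they should be compared against the figures reported in this issue rather than assumed.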

Benchmark Results:

[Two images of the benchmark results]

Analysis:
While our .join() alternative is indeed slightly faster (a marginal CPU gain), it results in a RAM overconsumption of nearly 300% compared to +=.

Conclusion & Recommendation:
Just as this rule was rejected for Java (where JDK 8 optimizations made it obsolete), CPython's internal optimizations make it counterproductive for Python in terms of RAM sobriety.

Required Action:
I propose changing the status of rule GCI75 from pending to Not applicable for Python in the RULES.md file and closing any related implementations.
