Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug when constructing a string from an array of characters #54275

Closed
d-netto opened this issue Apr 26, 2024 · 4 comments · Fixed by #54331
Closed

Possible bug when constructing a string from an array of characters #54275

d-netto opened this issue Apr 26, 2024 · 4 comments · Fixed by #54331

Comments

@d-netto
Copy link
Member

d-netto commented Apr 26, 2024

Consider the MWE provided by @tveldhui:

function foo()
    String(UInt8['a' for i in 1:10000000])
end

function test()
    GC.gc(true)
    baseline = Base.gc_live_bytes()
    while true
        foo()
        GC.gc(true)
        print("gc_live_bytes = +$((Base.gc_live_bytes() - baseline)/1024^2)MiB\n")
    end
end

test()

If we run it, gc_live_bytes seems to increase at a rate of 10MB/iteration:

gc_live_bytes = +514.9888172149658MiB
gc_live_bytes = +524.5256814956665MiB
gc_live_bytes = +534.0625457763672MiB
gc_live_bytes = +543.5994100570679MiB
gc_live_bytes = +553.1362743377686MiB
gc_live_bytes = +562.6731386184692MiB
gc_live_bytes = +572.2100028991699MiB
gc_live_bytes = +581.7468671798706MiB
gc_live_bytes = +591.2837314605713MiB
gc_live_bytes = +600.820595741272MiB

I'm not totally familiar with the Julia array implementation, but this seems to be stemming from a bug in the jl_array_to_string implementation:

JL_DLLEXPORT jl_value_t *jl_array_to_string(jl_array_t *a)
{
    size_t len = jl_array_len(a);
    if (len == 0) {
        // this may seem like purely an optimization (which it also is), but it
        // also ensures that calling `String(a)` doesn't corrupt a previous
        // string also created the same way, where `a = StringVector(_)`.
        return jl_an_empty_string;
    }
    if (a->flags.how == 3 && a->offset == 0 && a->elsize == 1 &&
        (jl_array_ndims(a) != 1 ||
         ((a->maxsize + sizeof(void*) + 1 <= GC_MAX_SZCLASS) == (len + sizeof(void*) + 1 <= GC_MAX_SZCLASS)))) {
        jl_value_t *o = jl_array_data_owner(a);
        if (jl_is_string(o)) {
            a->flags.isshared = 1;
            *(size_t*)o = len;
            a->nrows = 0;
            a->length = 0;
            a->maxsize = 0;
            return o;
        }
    }
    a->nrows = 0;
    a->length = 0;
    a->maxsize = 0;
    return jl_pchar_to_string((const char*)jl_array_data(a), len);
}

On the latest commit of our fork at the time of writing (and also on stock 1.10), it seems like the jl_pchar_to_string called at the very end of jl_array_to_string will allocate a separate buffer for itself and copy the contents of the array into it:

JL_DLLEXPORT jl_value_t *jl_pchar_to_string(const char *str, size_t len)
{
    jl_value_t *s = jl_alloc_string(len);
    if (len > 0)
        memcpy(jl_string_data(s), str, len);
    return s;
}

This raises the question: if a new buffer is allocated to the string, and the contents of the array buffer are copied into it, what's the purpose of these three lines?

    a->nrows = 0;
    a->length = 0;
    a->maxsize = 0;

Seems like we should not have them.

Commenting these three lines seems to solve the issue raised in the MWE above, but again, I'm not fully knowledgeable in this part of the codebase to know whether this proposed change makes sense.

Patch:

diff --git a/src/array.c b/src/array.c
index 5226c729d3..628d40a727 100644
--- a/src/array.c
+++ b/src/array.c
@@ -479,9 +479,9 @@ JL_DLLEXPORT jl_value_t *jl_array_to_string(jl_array_t *a)
             return o;
         }
     }
-    a->nrows = 0;
-    a->length = 0;
-    a->maxsize = 0;
+    // a->nrows = 0;
+    // a->length = 0;
+    // a->maxsize = 0;
     return jl_pchar_to_string((const char*)jl_array_data(a), len);
 }

And new results after applying it:

gc_live_bytes = +0.0049285888671875MiB
gc_live_bytes = +0.005049705505371094MiB
gc_live_bytes = +0.0051708221435546875MiB
gc_live_bytes = +0.005414009094238281MiB
gc_live_bytes = +0.005413055419921875MiB
gc_live_bytes = +0.005534172058105469MiB
gc_live_bytes = +0.0056552886962890625MiB
gc_live_bytes = +0.005898475646972656MiB
gc_live_bytes = +0.00589752197265625MiB
gc_live_bytes = +0.006018638610839844MiB
@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 26, 2024

I don't think any of those fields exist anymore. Are you sure you are looking at master and not an older release?

@d-netto
Copy link
Member Author

d-netto commented Apr 26, 2024

Ah sorry, that's on our fork of Julia, which is a patched 1.10 (will edit the comment above to provide a clarification on that).

But yes, the issue reproduces on stock 1.10.

@oscardssmith
Copy link
Member

one thing worth checking. does the actual memory use increase or is it just a live_bytes reporting issue?

@d-netto
Copy link
Member Author

d-netto commented Apr 26, 2024

RSS stays the same, just live_bytes increases. This makes sense if you take a look at jl_gc_free_array:

static void jl_gc_free_array(jl_array_t *a) JL_NOTSAFEPOINT
{
    if (a->flags.how == 2) {
        char *d = (char*)a->data - a->offset*a->elsize;
        if (a->flags.isaligned)
            jl_free_aligned(d);
        else
            free(d);
        gc_num.freed += jl_array_nbytes(a);
        gc_num.freecall++;
    }
}

The fact that the length is zero-ed out here won't change the fact that free still runs. Just the increment to gc_num.freed that's wrong.

d-netto added a commit to RelationalAI/julia that referenced this issue May 2, 2024
d-netto added a commit to RelationalAI/julia that referenced this issue May 3, 2024
d-netto added a commit that referenced this issue May 6, 2024
Master version of #54309.

Should fix #54275.

---------

Co-authored-by: Jameson Nash <vtjnash@gmail.com>
KristofferC pushed a commit that referenced this issue May 6, 2024
Master version of #54309.

Should fix #54275.

---------

Co-authored-by: Jameson Nash <vtjnash@gmail.com>
(cherry picked from commit 92ccc74)
DelveCI pushed a commit to RelationalAI/julia that referenced this issue May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants