
Segfaults on test g420stch.ijs #166

@herwinw


It might be something weird on my machine, since I haven't seen these show up in the CI runs. My build consistently fails on this one test, regardless of whether I build with GCC or Clang, and regardless of whether it's a debug or release build. It has been like this since my first checkout, so recent refactors are unlikely to be the culprit.

$ build/jsrc/Debug/jconsole build/test/Debug/g420stch.ijs
Hello YouTube Viewers, all 22 of you!
see: tsu_notes, tsu_usage, tsu_pacman, and tsu_jd

   RUN  ddall  NB. report scripts that fail
   RECHO ddall NB. echo script names as run and final count of failures

Segmentation fault

Removing this single line from the test stops the segfault:

'length error' -: ,.&.>/ etx 1 2;3 4 5

Replacing the expected error message with something else causes the test to fail, but it does prevent the segfault.

I've managed to reproduce in the console with the following code:

join=: ,.&.>/
join 1 2;3 4 5

That join is copied from jlibrary/system/util/pm.ijs; using the operation directly does not result in a segfault.

GDB shows the location of the error:

0x00007ffff73c3cab in jtgaf (jt=0x446000, blockx=6) at ../jsrc/m.c:1066
1066	                jt->mfree[-PMINL + 1 + blockx].pool = AFCHAIN(z);  // remove & use the head of the free chain

(gdb) info locals
pushp = 0x6c8db0
z = 0x40
mfreeb = 259200
n = 128

Looking at jt->mfree, there is one very suspicious-looking entry, which contains the 0x40 value seen in the z variable:

(gdb) print jt->mfree
$4 = {{ballo = 614592, pool = 0x921cc0}, {ballo = 259200, pool = 0x40}, {ballo = 985344, pool = 0x6c9440}, {ballo = 922112, pool = 0x5f6980}, {ballo = 977920, pool = 0x6795c0}}

PMINL is defined as 6 (jsrc/m.h) and the argument blockx equals 6, so the calculation -PMINL + 1 + blockx gives index 1 and would indeed select the second entry. This block was fetched a few lines earlier in m.c:

z = jt->mfree[-PMINL + 1 + blockx].pool;  // tentatively use head of free list as result - normal case, and even if
                                          // blockx is out of bounds will not segfault

Yes, that comment made me grin.
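
The index arithmetic can be double-checked with a small C sketch. The value of PMINL is the one quoted from jsrc/m.h in this issue; `pool_index` is a hypothetical helper written for illustration and does not exist in the J source.

```c
#include <assert.h>

/* Value quoted from jsrc/m.h in this issue. */
#define PMINL 6

/* Hypothetical helper (not part of the J source): mirrors the index
   expression -PMINL + 1 + blockx that jtgaf uses to pick an mfree entry. */
static int pool_index(int blockx) {
    assert(blockx >= PMINL - 1);   /* smaller values would give a negative index */
    return -PMINL + 1 + blockx;
}
```

With blockx = 6 this gives index 1, which is the entry holding pool = 0x40 in the GDB dump.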

The segfault itself is caused by the AFCHAIN macro, also in m.c:

#define AFCHAIN(a) ((a)->kchain.chain)            // the chain field, when the block is not allocated

So I guess something returns a null pointer, the first 8 bytes of that something are interpreted as some data, and the remainder is memory that is supposed to be usable.
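
To illustrate why copying the head of the free list is harmless while following its chain pointer is not, here is a minimal C sketch of a free-list pop. It is a simplified model, not the J allocator: the `block` type, the assumption that the chain link is the first word of a block, the 4096-byte threshold, and both function names are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a free block. Assumption: the link to the next
   free block is the first word (the real layout lives in the J headers). */
typedef struct block { struct block *chain; } block;

/* Hypothetical debugging check: addresses below one typical page (4096)
   are treated as corrupt. 0x40 from the GDB session falls in this range. */
#define MIN_PLAUSIBLE_ADDR ((uintptr_t)4096)

static int head_is_plausible(const block *head) {
    return head != NULL && (uintptr_t)head >= MIN_PLAUSIBLE_ADDR;
}

/* Pop the head of a free list. Returns NULL and leaves the list untouched
   when the head pointer looks corrupt, instead of dereferencing it. */
static block *pop_checked(block **pool) {
    block *z = *pool;            /* copying the head pointer never faults */
    if (!head_is_plausible(z))
        return NULL;             /* an unchecked pop would crash on the next line */
    *pool = z->chain;            /* analogous to AFCHAIN(z): the actual dereference */
    return z;
}
```

A check like this would not fix whatever wrote 0x40 into the pool, but placed in a debug build it could turn the segfault into a diagnosable failure closer to the point of corruption.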

Since this is happening to me with multiple compilers, and nobody else appears to be having this problem, it might be a problem with my libc or something like that.

I'll try to look into it more deeply in the next few days.
