Current version of Raptor apparently causes Soprano to segfault: possible to fix? #66

Open
barracuda156 opened this issue Apr 12, 2024 · 25 comments

@barracuda156

@dajobe Sorry to bother you with this. It may not be a Raptor bug, but it does seem that Raptor is involved somehow.

We are currently unable to build the KDE4 libs because soprano segfaults for an unclear reason, and the logs point at Raptor. Notably, the failure occurs on different macOS versions and different architectures (apparently it works nowhere, in fact).
Discussion is here: https://trac.macports.org/ticket/68452

This is what I see in a crash log on a PowerPC:

Process:         onto2vocabularyclass [73240]
Path:            /opt/local/bin/onto2vocabularyclass
Identifier:      onto2vocabularyclass
Version:         ??? (???)
Code Type:       PPC (Native)
Parent Process:  sh [73239]

Date/Time:       2024-04-12 16:26:02.770 +0800
OS Version:      Mac OS X 10.6 (10A190)
Report Version:  6

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000c1012beb
Crashed Thread:  0

Thread 0 Crashed:
0   libraptor2.0.dylib            	0x0435746c raptor_free_uri + 48
1   libraptor2.0.dylib            	0x04359604 raptor_free_namespace + 36
2   libraptor2.0.dylib            	0x043596f0 raptor_namespaces_clear + 188
3   libraptor2.0.dylib            	0x0436decc raptor_turtle_parse_terminate + 24
4   libraptor2.0.dylib            	0x043543ec raptor_free_parser + 44
5   libsoprano_raptorparser.so    	0x043489e4 Soprano::Raptor::Parser::parseStream(QTextStream&, QUrl const&, Soprano::RdfSerialization, QString const&) const + 1064

The issue is also confirmed on x86 (the referenced ticket has full logs).

Could this be addressed? It would be really helpful.

@cooljeanius

raptor_free_uri is defined here:

raptor/src/raptor_uri.c

Lines 475 to 505 in 72a8a2d

/**
 * raptor_free_uri:
 * @uri: URI to destroy
 *
 * Destructor - destroy a #raptor_uri object
 **/
void
raptor_free_uri(raptor_uri *uri)
{
  if(!uri)
    return;

  uri->usage--;

#if defined(RAPTOR_DEBUG) && RAPTOR_DEBUG > 1
  RAPTOR_DEBUG3("URI %s usage count now %d\n", uri->string, uri->usage);
#endif

  /* decrement usage, don't free if not 0 yet*/
  if(uri->usage > 0) {
    return;
  }

  /* this does not free the uri */
  if(uri->world->uris_tree)
    raptor_avltree_delete(uri->world->uris_tree, uri);

  if(uri->string)
    RAPTOR_FREE(char*, uri->string);
  RAPTOR_FREE(raptor_uri, uri);
}

I see there's a check guarding against the uri pointer being null, but one of its sub-pointers, uri->world, could still be null. What happens if you add a check that uri->world isn't null before dereferencing it? Does that fix it?

@dajobe
Owner

dajobe commented Apr 12, 2024

There's not enough information here to help.

My guess is that it's how raptor's functions are being called. Since raptor isn't written in a reference-counted language with garbage collection, it can't protect against use-after-free if the caller frees twice.

I would suggest building and running your app with something like valgrind or clang's ASan to see if there are such issues.

I have tested raptor release code that way and in other ways, such as with Coverity.
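
To illustrate the point about unbalanced frees by a caller, here is a minimal sketch. It is not raptor's code: the `toy_uri` type, the fixed-size pool, and all `toy_*` function names are hypothetical. The pool keeps destroyed slots inspectable, so the extra release can be reported instead of touching freed heap memory as it would with malloc/free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted object; not raptor's struct. */
typedef struct {
  int usage;   /* number of outstanding references */
  int alive;   /* 1 while the slot is allocated, 0 after destruction */
} toy_uri;

#define TOY_POOL_SIZE 4
static toy_uri toy_pool[TOY_POOL_SIZE];

/* Allocate a slot from the static pool with one reference; NULL if full. */
static toy_uri *toy_uri_new(void)
{
  size_t i;
  for(i = 0; i < TOY_POOL_SIZE; i++) {
    if(!toy_pool[i].alive) {
      toy_pool[i].alive = 1;
      toy_pool[i].usage = 1;
      return &toy_pool[i];
    }
  }
  return NULL;
}

/* Take an extra reference (the caller must already hold one). */
static toy_uri *toy_uri_copy(toy_uri *uri)
{
  uri->usage++;
  return uri;
}

/* Drop one reference. NULL is ignored and returns 0.
 * Returns 0 if references remain, 1 if this call destroyed the object,
 * and -1 if the object was already destroyed (an unbalanced release).
 * With malloc/free the -1 case would be a use-after-free instead. */
static int toy_uri_release(toy_uri *uri)
{
  if(!uri)
    return 0;
  if(!uri->alive)
    return -1;
  if(--uri->usage > 0)
    return 0;
  uri->alive = 0;
  return 1;
}

/* Two owners release once each, then one owner releases again by mistake.
 * Returns the result of the extra release, or -2 if the setup misbehaved. */
static int toy_double_release_demo(void)
{
  toy_uri *a = toy_uri_new();
  toy_uri *b;
  int r1, r2;

  assert(a != NULL);
  b = toy_uri_copy(a);
  r1 = toy_uri_release(a);   /* b still holds a reference */
  r2 = toy_uri_release(b);   /* last reference: destroyed */
  if(r1 != 0 || r2 != 1)
    return -2;
  return toy_uri_release(a); /* one release too many */
}
```

Calling `toy_double_release_demo()` returns the status of the third release; in a real heap-backed implementation nothing would report that condition, which is why tools such as valgrind or ASan are needed to find it.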

cooljeanius added a commit to cooljeanius/raptor2 that referenced this issue Apr 12, 2024
check for potential null pointer dereference; see dajobe#66
(untested)
@barracuda156
Author

@dajobe By the way, since the problem happens on Intel too, maybe you could try sudo port -v build kdelibs4 in MacPorts?
It should download everything prebuilt up to that point, so it will not waste time on compilation.

@barracuda156
Author

@cooljeanius Does it solve the issue on x86 for you?
I can try on PowerPC, of course.

@cooljeanius

@cooljeanius Does it solve the issue on x86 for you? I can try on PowerPC, of course.

You mean cooljeanius/raptor2@85fc5b7? I haven't tested it yet, which is why I haven't submitted a PR for it yet...

@kencu

kencu commented Apr 17, 2024

I did try @cooljeanius's patch. With it, the error no longer occurs in the raptor code, but it still errors later in the system.

I rebuilt raptor with clang's AddressSanitizer enabled, and I believe it shows that a double free is happening, as suspected. I'm not 100% sure why or how to fix it at the moment. Perhaps someone familiar with how raptor works, like @dajobe, might see the issue?

$  DYLD_INSERT_LIBRARIES=/Library/Developer/CommandLineTools/usr/lib/clang/8.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib /opt/local/bin/onto2vocabularyclass --name TMO --encoding trig --namespace Nepomuk::Vocabulary --export-module nepomuk /opt/local/share/ontology/pimo/tmo.trig
=================================================================
==54557==ERROR: AddressSanitizer: heap-use-after-free on address 0x603000076d04 at pc 0x000111b9b49f bp 0x7fff53c78000 sp 0x7fff53c77ff8
READ of size 4 at 0x603000076d04 thread T0
    #0 0x111b9b49e in raptor_free_uri raptor_uri.c:487
    #1 0x111ba3189 in raptor_free_namespace raptor_namespace.c:688
    #2 0x111ba2c38 in raptor_namespaces_clear raptor_namespace.c:303
    #3 0x111c28dfb in raptor_turtle_parse_terminate .turtle_parser.y:1535
    #4 0x111b87883 in raptor_free_parser raptor_parse.c:500
    #5 0x111b75eec in Soprano::Raptor::Parser::parseStream(QTextStream&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:271
    #6 0x111b75505 in Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:200
    #7 0x111b7576a in non-virtual thunk to Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:192
    #8 0x10bf8b0b1 in main onto2vocabularyclass.cpp:275
    #9 0x7fff8f6925ac in start (libdyld.dylib+0x35ac)

0x603000076d04 is located 20 bytes inside of 24-byte region [0x603000076cf0,0x603000076d08)
freed by thread T0 here:
    #0 0x10bff4db9 in wrap_free (libclang_rt.asan_osx_dynamic.dylib+0x4adb9)
    #1 0x111b9b688 in raptor_free_uri raptor_uri.c:504
    #2 0x111c2541e in yydestruct .turtle_parser.y:203
    #3 0x111c23879 in turtle_parser_parse turtle_parser.c:3178
    #4 0x111c2a8e7 in turtle_parse .turtle_parser.y:1430
    #5 0x111c29d1e in raptor_turtle_parse_chunk .turtle_parser.y:1750
    #6 0x111b897e7 in raptor_parser_parse_chunk raptor_parse.c:482
    #7 0x111b75ed8 in Soprano::Raptor::Parser::parseStream(QTextStream&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:270
    #8 0x111b75505 in Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:200
    #9 0x111b7576a in non-virtual thunk to Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:192
    #10 0x10bf8b0b1 in main onto2vocabularyclass.cpp:275
    #11 0x7fff8f6925ac in start (libdyld.dylib+0x35ac)

previously allocated by thread T0 here:
    #0 0x10bff5157 in wrap_calloc (libclang_rt.asan_osx_dynamic.dylib+0x4b157)
    #1 0x111b99135 in raptor_new_uri_from_counted_string raptor_uri.c:150
    #2 0x111b99a08 in raptor_new_uri_relative_to_base_counted raptor_uri.c:302
    #3 0x111b99b48 in raptor_new_uri_relative_to_base raptor_uri.c:325
    #4 0x111c100d7 in turtle_lexer_lex .turtle_lexer.l:514
    #5 0x111c1b674 in turtle_parser_parse turtle_parser.c:1680
    #6 0x111c2a8e7 in turtle_parse .turtle_parser.y:1430
    #7 0x111c29d1e in raptor_turtle_parse_chunk .turtle_parser.y:1750
    #8 0x111b897e7 in raptor_parser_parse_chunk raptor_parse.c:482
    #9 0x111b75ea4 in Soprano::Raptor::Parser::parseStream(QTextStream&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:267
    #10 0x111b75505 in Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:200
    #11 0x111b7576a in non-virtual thunk to Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const raptorparser.cpp:192
    #12 0x10bf8b0b1 in main onto2vocabularyclass.cpp:275
    #13 0x7fff8f6925ac in start (libdyld.dylib+0x35ac)

SUMMARY: AddressSanitizer: heap-use-after-free raptor_uri.c:487 in raptor_free_uri
Shadow bytes around the buggy address:
  0x1c060000ed50: fd fd fd fa fa fa fd fd fd fd fa fa fd fd fd fa
  0x1c060000ed60: fa fa fd fd fd fd fa fa fd fd fd fa fa fa fd fd
  0x1c060000ed70: fd fd fa fa fd fd fd fa fa fa fd fd fd fd fa fa
  0x1c060000ed80: fd fd fd fd fa fa fd fd fd fa fa fa fd fd fd fd
  0x1c060000ed90: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
=>0x1c060000eda0:[fd]fa fa fa fd fd fd fd fa fa fd fd fd fa fa fa
  0x1c060000edb0: fd fd fd fd fa fa fd fd fd fa fa fa fd fd fd fd
  0x1c060000edc0: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
  0x1c060000edd0: fd fa fa fa fd fd fd fd fa fa fd fd fd fa fa fa
  0x1c060000ede0: fd fd fd fd fa fa 00 00 00 fa fa fa fd fd fd fd
  0x1c060000edf0: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==54557==ABORTING
Abort trap: 6
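
The "20 bytes inside of 24-byte region" line in the report would be consistent with the 4-byte read being the usage counter touched at raptor_uri.c:487. The sketch below checks that arithmetic under an assumed layout: two pointers followed by two 4-byte integers. The `sketch_uri` struct and the `sketch_*` functions are hypothetical and only illustrate the offset calculation; the real definition should be checked in raptor's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed layout for illustration only; check raptor's sources for the real one. */
typedef struct {
  void *world;           /* owning world pointer */
  unsigned char *string; /* URI string */
  unsigned int length;   /* string length */
  int usage;             /* reference count */
} sketch_uri;

/* Byte offset of the usage counter within the struct. */
static size_t sketch_usage_offset(void)
{
  return offsetof(sketch_uri, usage);
}

/* Print the numbers to compare against the ASan report. */
static void sketch_print_layout(void)
{
  /* The counter must lie entirely inside the struct. */
  assert(sketch_usage_offset() + sizeof(int) <= sizeof(sketch_uri));
  printf("sizeof(sketch_uri) = %zu, offsetof(usage) = %zu, sizeof(usage) = %zu\n",
         sizeof(sketch_uri), sketch_usage_offset(), sizeof(int));
}
```

On a target with 8-byte pointers and 4-byte int, this layout gives a 24-byte struct with the counter at offset 20, matching the size and offset in the report above.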

@kencu

kencu commented Apr 17, 2024

This appears to be the spot in soprano where these calls emanate from:

https://invent.kde.org/unmaintained/soprano/-/blob/master/parsers/raptor/raptorparser.cpp?ref_type=heads#L271

    // if possible let raptor do the decoding
    if ( QIODevice* dev = stream.device() ) {
        QByteArray buf( bufSize, 0 );
        while ( !dev->atEnd() ) {
            qint64 r = dev->read( buf.data(), buf.size() );
            if ( r <= 0 ||
                 raptor_parser_parse_chunk( parser, ( const unsigned char* )buf.data(), r, 0 ) ) {
                // parse_chunck return failure code.
                // Call it with END=true and then free
                raptor_parser_parse_chunk(parser,0,0,/*END=*/1);
                raptor_free_parser( parser );
                if ( raptorBaseUri ) {
                    raptor_free_uri( raptorBaseUri );
                }
                return StatementIterator();
            }
        }

perhaps "parser" needs to be tested as non-NULL prior to this call?

raptor_free_parser( parser );

@kencu

kencu commented Apr 17, 2024

testing for null didn't fix the issue, but removing the line did, right or wrong.

that then leads to this error:

$  DYLD_INSERT_LIBRARIES=/Library/Developer/CommandLineTools/usr/lib/clang/8.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib /opt/local/bin/onto2vocabularyclass --name TMO --encoding trig --namespace Nepomuk::Vocabulary --export-module nepomuk /opt/local/share/ontology/pimo/tmo.trig
Failed to parse file/opt/local/share/ontology/pimo/tmo.trig(Parsing failed (3): syntax error, unexpected end of file, expecting } (line: 75, column: -1))

which may indicate what the problem really is...

@dajobe
Owner

dajobe commented Apr 17, 2024

If you can demonstrate this issue/crash with the latest release build of raptor and the 'rapper' utility, then I probably can look deeper.

FWIW I only have amd64, aarch64 (linux) / arm64 (darwin), armv7l, riscv arches here to test.

@barracuda156
Author

FWIW I only have amd64, aarch64 (linux) / arm64 (darwin), armv7l, riscv arches here to test.

The issue is present on x86_64, AFAICT.

@kencu

kencu commented Apr 19, 2024

I don't think this issue has anything to do with raptor really -- although I guess ideally it shouldn't crash when given bogus data, that is not raptor's fault.

The main problem is most likely soprano, which is ancient, out-of-date, unsupported upstream, and horribly fails its test suite when that is attempted.

@barracuda156
Author

I don't think this issue has anything to do with raptor really -- although I guess ideally it shouldn't crash when given bogus data, that is not raptor's fault.

The main problem is most likely soprano, which is ancient, out-of-date, unsupported upstream, and horribly fails its test suite when that is attempted.

@kencu But it presumably worked at some point, right? At least in the sense of KDE4 ports building and working with it (not necessarily passing its own test suite).
What has changed to break it? If soprano has introduced some bug, we can roll back, of course, or fix related code in it.

@kencu

kencu commented Apr 20, 2024

Yes, soprano worked last year when I built kdelibs4 on Sonoma. Something has broken it since then.

Not raptor, though. Let's leave this man in peace.

@RJVB

RJVB commented May 22, 2024

FWIW, the issue occurs with raptor 2.0.16 but not 2.0.15; maybe there's some useful information in that? Someone motivated enough could do a git bisect to find the raptor commit that introduced the breaking change.

perhaps "parser" needs to be tested as non-NULL prior to this call?
That, and it might be wise to set it to nullptr after it is freed (in every location where that happens), especially since a double free is suspected. It seems likely that this is a more or less long-standing latent bug in Soprano that just never triggered anything bad enough to cause a crash. I'll look into it using malloc's options for exposing common memory management errors, to see if that teaches me anything with older raptor versions that don't crash.
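
The idea of resetting the pointer after freeing it can be sketched in C as follows. This is an illustration only, not Soprano's or raptor's code: `toy_parser`, `toy_parser_new`, `toy_parser_free_and_null` and `toy_cleanup_twice` are hypothetical names. Note that it only protects against a second free through the same variable; any other copy of the pointer would still dangle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser handle used only for this sketch. */
typedef struct {
  int chunks_seen;
} toy_parser;

/* Counts real frees so the effect can be observed. */
static int toy_free_count = 0;

/* Allocate a zeroed handle; returns NULL on allocation failure. */
static toy_parser *toy_parser_new(void)
{
  return calloc(1, sizeof(toy_parser));
}

/* Free *pp (if set) and reset the caller's variable to NULL.
 * A second call through the same variable becomes a no-op. */
static void toy_parser_free_and_null(toy_parser **pp)
{
  if(!pp || !*pp)
    return;
  free(*pp);
  *pp = NULL;
  toy_free_count++;
}

/* Error path that runs its cleanup twice by mistake.
 * Returns how many real frees ran, or -1 if allocation failed. */
static int toy_cleanup_twice(void)
{
  toy_parser *parser = toy_parser_new();
  int before = toy_free_count;

  if(!parser)
    return -1;
  toy_parser_free_and_null(&parser); /* first cleanup */
  toy_parser_free_and_null(&parser); /* repeated cleanup: ignored */
  assert(parser == NULL);
  return toy_free_count - before;
}
```

In this sketch the repeated cleanup is harmless because the helper takes the address of the variable; a plain NULL test before a free that does not reset the pointer would not catch the second call.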

@RJVB

RJVB commented May 22, 2024

FWIW bis: I can reproduce the crash with soprano 2.9.4 on Kubuntu 14.04 with a self-installed raptor 2.0.16:

Starting program: /usr/bin/onto2vocabularyclass --name TMO --encoding trig --namespace Nepomuk::Vocabulary --export-module nepomuk /usr/share/ontology/pimo/tmo.trig
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Traceback (most recent call last):

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e29d2c in raptor_avltree_delete_internal (tree=tree@entry=0xffffffffffffffff, 
    node_pp=node_pp@entry=0xffffffffffffffff, p_data=0x774200, rebalancing_p=rebalancing_p@entry=0x7fffffffd52c)
    at raptor_avltree.c:777
777       if(*node_pp == NULL) {
(gdb) bt
#0  0x00007ffff7e29d2c in raptor_avltree_delete_internal (tree=tree@entry=0xffffffffffffffff, 
    node_pp=node_pp@entry=0xffffffffffffffff, p_data=0x774200, rebalancing_p=rebalancing_p@entry=0x7fffffffd52c)
    at raptor_avltree.c:777
#1  0x00007ffff7e2c1b0 in raptor_avltree_remove (tree=tree@entry=0xffffffffffffffff, p_data=<optimised out>)
    at raptor_avltree.c:293
#2  0x00007ffff7e2c1c9 in raptor_avltree_delete (tree=0xffffffffffffffff, p_data=<optimised out>)
    at raptor_avltree.c:322
#3  0x00007ffff7e2404d in raptor_free_uri (uri=0x774200) at raptor_uri.c:500
#4  raptor_free_uri (uri=0x774200) at raptor_uri.c:482
#5  0x00007ffff7e24327 in raptor_free_namespace (ns=0x7742c0) at raptor_namespace.c:688
#6  0x00007ffff7e24393 in raptor_namespaces_clear (nstack=0x767220) at raptor_namespace.c:303
#7  0x00007ffff7e3b681 in raptor_turtle_parse_terminate (rdf_parser=<optimised out>) at ./turtle_parser.y:1535
#8  0x00007ffff7e2854c in raptor_free_parser (rdf_parser=0x767380) at raptor_parse.c:500
#9  0x00007ffff7fe94b4 in Soprano::Raptor::Parser::parseStream (this=<optimised out>, stream=..., baseUri=..., 
    serialization=<optimised out>, userSerialization=...)
    at raptorparser.cpp:271
#10 0x00007ffff7fe8f56 in Soprano::Raptor::Parser::parseFile (this=0x7240e0, filename=..., baseUri=..., 
    serialization=Soprano::SerializationTrig, userSerialization=...)
    at raptorparser.cpp:200
#11 0x00007ffff7fe90b9 in non-virtual thunk to Soprano::Raptor::Parser::parseFile(QString const&, QUrl const&, Soprano::RdfSerialization, QString const&) const (this=<optimised out>, filename=..., baseUri=..., 
    serialization=(unknown: 0x76b800), userSerialization=...)
    at raptorparser.cpp:201
#12 0x0000000000405e42 in main (argc=10, argv=<optimised out>)
    at onto2vocabularyclass.cpp:275

I'm a little surprised that neither platform gives a runtime error about doing a double free: I'm used to getting those.

@RJVB

RJVB commented May 22, 2024

If you can demonstrate this issue/crash with the latest release build of raptor and the 'rapper' utility, then I probably can look deeper.

Here you go (Linux AMD64, raptor2 2.0.16):

> rapper -q -i trig /usr/share/ontology/pimo/pimo.trig
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#State> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#State> <http://www.w3.org/2000/01/rdf-schema#comment> "Administrative subdivisions of a Nation that are broader than any other political subdivisions that may exist. This Class includes the states of the United States, as well as the provinces of Canada and European countries. (Definition from SUMO)." .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#State> <http://www.w3.org/2000/01/rdf-schema#label> "State" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#State> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Location> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Room> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Room> <http://www.w3.org/2000/01/rdf-schema#comment> "A properPart of a Building which is separated from the exterior of the Building and/or other Rooms of the Building by walls. Some Rooms may have a specific purpose, e.g. sleeping, bathing, cooking, entertainment, etc. (Definition from SUMO)." .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Room> <http://www.w3.org/2000/01/rdf-schema#label> "Room" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Room> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Location> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#RuleViewSpecificationInferOccurrences> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#RuleViewSpecification> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#RuleViewSpecificationInferOccurrences> <http://www.w3.org/2000/01/rdf-schema#label> "RuleViewSpecificationInferOccurrences" .
<http://www.w3.org/2003/01/geo/wgs84_pos#lat> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.w3.org/2003/01/geo/wgs84_pos#lat> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.w3.org/2001/XMLSchema#float> .
<http://www.w3.org/2003/01/geo/wgs84_pos#lat> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#datatypeProperty> .
<http://www.w3.org/2003/01/geo/wgs84_pos#lat> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#maxCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#InferOccurrences> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#GraphView> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#InferOccurrences> <http://www.w3.org/2000/01/rdf-schema#label> "InferOccurrences" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Country> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Country> <http://www.w3.org/2000/01/rdf-schema#comment> "The territory occupied by a nation; \"he returned to the land of his birth\"; \"he visited several European countries\". (Definition from SUMO)" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Country> <http://www.w3.org/2000/01/rdf-schema#label> "Country" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Country> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Location> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#comment> "The subject's contents describes the object. Or the subject can be seen as belonging to the thing described by the object.  Similar semantics as skos:subject." .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#domain> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Thing> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#label> "has tag" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Tag> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#hasTag> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#objectProperty> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasTag> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#inverseProperty> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#isTagFor> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#comment> "The unique label of the tag. The label must be unique within the scope of one PersonalInformationModel. It is required and a subproperty of nao:prefLabel. It clarifies the use of nao:personalIdentifier by restricting the scope to tags. Semantically equivalent to skos:prefLabel, where uniqueness of labels is also recommended." .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#domain> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Tag> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#label> "tag label" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.w3.org/2000/01/rdf-schema#Literal> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#prefLabel> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#personalIdentifier> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#maxCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#tagLabel> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#minCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#TransitiveProperty> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/2000/01/rdf-schema#comment> "hasOtherRepresentation points from a Thing in your PIMO to a thing in an ontology that represents the same real world thing. \nThis means that the real world object O represented by an instance I1 has additional representations (as instances I2-In of different conceptualizations).\nThis means: IF (I_i represents O_j in Ontology_k) AND (I_m represents O_n in Ontology_o) THEN (O_n and O_j are the same object).\nhasOtherRepresentation is a transitive relation, but not equivalent (not symmetric nor reflexive).\n\nFor example, the URI of a  foaf:Person representation published on the web is a hasOtherRepresentation for the person. This property is inverse functional, two Things from two information models having the same hasOtherRepresentation are considered to be representations of the same entity from the real world.\n\nTODO: rename this to subjectIndicatorRef to resemble topic maps ideas?" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/2000/01/rdf-schema#domain> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Thing> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/2000/01/rdf-schema#label> "has other representation" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherRepresentation> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#occurrence> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Collection> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Collection> <http://www.w3.org/2000/01/rdf-schema#comment> "A collection of Things, independent of their class. The items in the collection share a common property. Which property may be modelled explicitly or mentioned in the description of the Collection. The requirement of explicit modelling the semantic meaning of the collection is not mandatory, as collections can be created ad-hoc. Implizit modelling can be applied by the system by learning the properties. For example, a Collection of \"Coworkers\" could be defined as that all elements must be of class \"Person\" and have an attribute \"work for the same Organization as the user\". Further standards can be used to model these attributes." .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Collection> <http://www.w3.org/2000/01/rdf-schema#label> "Collection" .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Collection> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Thing> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherConceptualization> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#hasOtherConceptualization> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#TransitiveProperty> .
rapper: Failed to parse file /usr/share/ontology/pimo/pimo.trig trig content
Segmentation fault
Exit 139

pimo.trig.gz

@RJVB

RJVB commented May 22, 2024

Also, if you look at Soprano's raptor parser, there is no evident possibility of a double free or any other kind of memory management error. It's all pretty straightforward: allocate a raptor parser, use it until done or until a parsing error occurs, free the parser and return. Checking for nullptr is pointless, but I added checks anyway, including the only one that could potentially make a difference (the allocation of the raptor_world*). Nothing helps.

I did find another shared-desktop-ontologies file where soprano crashes (with a double-free-or-corruption warning when run from the debugger) but rapper doesn't crash; file attached.
rdfs.trig.gz

@kencu
Copy link

kencu commented May 22, 2024

yes, as I said above, the issue could more likely be with the file being parsed

Failed to parse file/opt/local/share/ontology/pimo/tmo.trig(Parsing failed (3): syntax error, unexpected end of file, expecting } (line: 75, column: -1))

now that you found a way to trigger the crash just with rapper, perhaps it can be sorted. Thanks for that.

@RJVB

RJVB commented May 22, 2024 via email

@RJVB

RJVB commented May 22, 2024

tmo.trig.gz

@kencu

kencu commented May 22, 2024

I looked at the input last month but I could not spot the issue. Maybe you can see it? Linefeeds, something silly?

Now that it’s all internal to raptor perhaps it can be debugged more easily.

@RJVB

RJVB commented May 22, 2024 via email

@RJVB

RJVB commented May 22, 2024

BTW, the symptoms are a little different between Darwin and Linux:

> /opt/local/bin/rapper -i trig /opt/local/share/ontology/pimo/tmo.trig
rapper: Parsing URI file:///opt/local/share/ontology/pimo/tmo.trig with parser trig
rapper: Serializing with serializer ntriples
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#TMO_Instance_PersonInvolvementRole_Creator> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvementRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#TMO_Instance_PersonInvolvementRole_Creator> <http://www.w3.org/2000/01/rdf-schema#label> "TMO_Instance_PersonInvolvementRole_Creator" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#domain> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#AbilityCarrierInvolvement> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#label> "abilityCarrierRole" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#AbilityCarrierRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#stateTypeRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#maxCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#minCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#comment> "PersonInvolvement  realizes n-ary associations to Persons which are realtedd to an task. The involvement is further characterized by an PersonTaskRole." .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#label> "PersonInvolvement" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Association> .
rapper: Failed to parse file /opt/local/share/ontology/pimo/tmo.trig trig content
rapper(81240,0x7fff7bc22310) malloc: *** error for object 0x7fbb51c052a0: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort

Curiously, setting MallocScribble prevents the abort:

> env MallocScribble=1 /opt/local/bin/rapper -i trig /opt/local/share/ontology/pimo/tmo.trig
rapper(81330,0x7fff7bc22310) malloc: enabling scribbling to detect mods to free blocks
rapper: Parsing URI file:///opt/local/share/ontology/pimo/tmo.trig with parser trig
rapper: Serializing with serializer ntriples
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#TMO_Instance_PersonInvolvementRole_Creator> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvementRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#TMO_Instance_PersonInvolvementRole_Creator> <http://www.w3.org/2000/01/rdf-schema#label> "TMO_Instance_PersonInvolvementRole_Creator" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#domain> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#AbilityCarrierInvolvement> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#label> "abilityCarrierRole" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#range> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#AbilityCarrierRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#stateTypeRole> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#maxCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#abilityCarrierRole> <http://www.semanticdesktop.org/ontologies/2007/08/15/nrl#minCardinality> "1" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#comment> "PersonInvolvement  realizes n-ary associations to Persons which are realtedd to an task. The involvement is further characterized by an PersonTaskRole." .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#label> "PersonInvolvement" .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.w3.org/2000/01/rdf-schema#Resource> .
<http://www.semanticdesktop.org/ontologies/2008/05/20/tmo#PersonInvolvement> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Association> .
rapper: Failed to parse file /opt/local/share/ontology/pimo/tmo.trig trig content
rapper: Parsing returned 14 triples
Exit 1

Using jemalloc (via the jemalloc.sh script) has the same effect.

@dajobe
Owner

dajobe commented May 24, 2024

I can reproduce this crash with raptor Git HEAD and the given example file on both x86_64 and arm64.

@dajobe
Owner

dajobe commented May 24, 2024

Preliminary patch seems to be:

diff --git a/src/turtle_parser.y b/src/turtle_parser.y
index 7a59bf42..f5397210 100644
--- a/src/turtle_parser.y
+++ b/src/turtle_parser.y
@@ -231,7 +231,6 @@ graph: GRAPH_NAME_LEFT_CURLY
       if(turtle_parser->graph_name)
         raptor_free_term(turtle_parser->graph_name);
       turtle_parser->graph_name = raptor_new_term_from_uri(rdf_parser->world, $1);
-      raptor_free_uri($1);
       raptor_parser_start_graph(rdf_parser,
                                 turtle_parser->graph_name->value.uri, 1);
     }

although you'd need to regenerate the parser with bison after applying it.
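For readers unfamiliar with this kind of failure: the thread does not state the root cause, but one plausible reading of the patch is that the URI in `$1` was being released once more than it was retained, which would match the `pointer being freed was not allocated` message. The sketch below illustrates that general pattern with a toy reference-counted object. Every name in it (`toy_uri`, `toy_new_uri`, `toy_uri_copy`, `toy_free_uri`, `toy_balanced_demo`) is hypothetical and is not raptor's API or implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted URI; NOT raptor's real struct. */
typedef struct {
  int usage;     /* number of current owners */
  char *string;  /* owned copy of the URI text */
} toy_uri;

/* Count of toy_uri objects that have been created and not yet destroyed. */
int toy_live_objects = 0;

/* Create a URI with a single owner. Returns NULL on allocation failure. */
toy_uri *toy_new_uri(const char *s) {
  toy_uri *u = malloc(sizeof(*u));
  if(!u)
    return NULL;
  u->string = malloc(strlen(s) + 1);
  if(!u->string) {
    free(u);
    return NULL;
  }
  strcpy(u->string, s);
  u->usage = 1;
  toy_live_objects++;
  return u;
}

/* Register one more owner of the same object. */
toy_uri *toy_uri_copy(toy_uri *u) {
  u->usage++;
  return u;
}

/* Drop one owner; destroy the object when the last owner lets go. */
void toy_free_uri(toy_uri *u) {
  if(!u)
    return;
  if(--u->usage > 0)
    return;
  free(u->string);
  free(u);
  toy_live_objects--;
}

/* Two owners, two releases. Returns 0 if the object survived the first
 * release and was destroyed by the second, -1 otherwise. */
int toy_balanced_demo(void) {
  toy_uri *token = toy_new_uri("http://example.org/graph"); /* owner 1 */
  if(!token)
    return -1;
  toy_uri *term_ref = toy_uri_copy(token);                  /* owner 2 */

  toy_free_uri(token); /* owner 1 done: object must still be alive */
  int still_alive = (toy_live_objects == 1 && term_ref->usage == 1);

  toy_free_uri(term_ref); /* owner 2 done: object is destroyed here */

  /* Any further toy_free_uri() on this pointer would read and free memory
   * that is already gone: undefined behaviour, and the kind of error the
   * allocator reports as "pointer being freed was not allocated". */
  return still_alive ? toy_live_objects : -1;
}
```

Calling `toy_balanced_demo()` from a small `main` should return 0 when retains and releases are balanced. The demo deliberately stops short of the extra release because that is undefined behaviour: depending on the allocator it may abort, crash later, or appear to work, which would also be consistent with the differing results under `MallocScribble` and jemalloc reported above.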

@dajobe dajobe self-assigned this May 24, 2024