multi-dict and multi-thread tests crash with a segmentation fault on macOS when built with pcre2 #1514

@ryandesign

With the latest code in master, make check fails on macOS if link-grammar is built with pcre2 support. The multi-dict and multi-thread tests crash with a segmentation fault; this doesn't happen if I disable pcre2 with the --with-regexlib=c configure argument. According to the crash log, multi-dict crashed here:

Thread 6 Crashed:
0   libpcre2-8.0.dylib            	       0x1029e08a7 match + 49451
1   libpcre2-8.0.dylib            	       0x1029d4324 pcre2_match_8 + 4536
2   liblink-grammar.5.dylib       	       0x102886fbf reg_match + 39 (regex-morph.c:239) [inlined]
3   liblink-grammar.5.dylib       	       0x102886fbf match_regex + 207 (regex-morph.c:428)
4   liblink-grammar.5.dylib       	       0x1028b4b06 regex_guess + 12 (tokenize.c:377) [inlined]
5   liblink-grammar.5.dylib       	       0x1028b4b06 separate_word + 886 (tokenize.c:2681)
6   liblink-grammar.5.dylib       	       0x1028b4489 separate_sentence + 1257 (tokenize.c:3090)
7   liblink-grammar.5.dylib       	       0x1028ae13a sentence_split + 74 (sentence.c:93)
8   multi-dict                    	       0x1026fd5ba parse_one_sent(char const*) + 31 (multi-dict.cc:40) [inlined]
9   multi-dict                    	       0x1026fd5ba parse_sents(int, int) + 122 (multi-dict.cc:82)
10  multi-dict                    	       0x1026fd7e0 decltype(static_cast<void (*>(fp)(static_cast<int>(fp0), static_cast<int>(fp0))) std::__1::__invoke<void (*)(int, int), int, int>(void (*&&)(int, int), int&&, int&&) + 4 (type_traits:3918) [inlined]
11  multi-dict                    	       0x1026fd7e0 void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(int, int), int, int, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(int, int), int, int>&, std::__1::__tuple_indices<2ul, 3ul>) + 4 (thread:287) [inlined]
12  multi-dict                    	       0x1026fd7e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(int, int), int, int> >(void*) + 48 (thread:298)
13  libsystem_pthread.dylib       	    0x7ff800d7b4e1 _pthread_start + 125
14  libsystem_pthread.dylib       	    0x7ff800d76f6b thread_start + 15

while multi-thread crashed here:

Thread 3 Crashed:
0   libpcre2-8.0.dylib            	       0x10da218e6 match + 362
1   libpcre2-8.0.dylib            	       0x10da21324 pcre2_match_8 + 4536
2   liblink-grammar.5.dylib       	       0x10d8d3fbf reg_match + 39 (regex-morph.c:239) [inlined]
3   liblink-grammar.5.dylib       	       0x10d8d3fbf match_regex + 207 (regex-morph.c:428)
4   liblink-grammar.5.dylib       	       0x10d901b06 regex_guess + 12 (tokenize.c:377) [inlined]
5   liblink-grammar.5.dylib       	       0x10d901b06 separate_word + 886 (tokenize.c:2681)
6   liblink-grammar.5.dylib       	       0x10d901489 separate_sentence + 1257 (tokenize.c:3090)
7   liblink-grammar.5.dylib       	       0x10d8fb13a sentence_split + 74 (sentence.c:93)
8   multi-thread                  	       0x10d74a233 parse_one_sent(Dictionary_s*, Parse_Options_s*, char const*) + 51 (multi-thread.cc:34)
9   multi-thread                  	       0x10d74a042 parse_sents(Dictionary_s*, Parse_Options_s*, int, int) + 1378 (multi-thread.cc:125)
10  multi-thread                  	       0x10d74a445 decltype(static_cast<void (*>(fp)(static_cast<Dictionary_s*>(fp0), static_cast<Parse_Options_s*>(fp0), static_cast<int>(fp0), static_cast<int>(fp0))) std::__1::__invoke<void (*)(Dictionary_s*, Parse_Options_s*, int, int), Dictionary_s*, Parse_Options_s*, int, int>(void (*&&)(Dictionary_s*, Parse_Options_s*, int, int), Dictionary_s*&&, Parse_Options_s*&&, int&&, int&&) + 3 (type_traits:3918) [inlined]
11  multi-thread                  	       0x10d74a445 void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(Dictionary_s*, Parse_Options_s*, int, int), Dictionary_s*, Parse_Options_s*, int, int, 2ul, 3ul, 4ul, 5ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(Dictionary_s*, Parse_Options_s*, int, int), Dictionary_s*, Parse_Options_s*, int, int>&, std::__1::__tuple_indices<2ul, 3ul, 4ul, 5ul>) + 17 (thread:287) [inlined]
12  multi-thread                  	       0x10d74a445 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(Dictionary_s*, Parse_Options_s*, int, int), Dictionary_s*, Parse_Options_s*, int, int> >(void*) + 53 (thread:298)
13  libsystem_pthread.dylib       	    0x7ff800d7b4e1 _pthread_start + 125
14  libsystem_pthread.dylib       	    0x7ff800d76f6b thread_start + 15

@ampli said in #1505 (comment) that this is because:

The current regex-morph.c PCRE2 code doesn't support using multi-threading without threads.h.

Possible solutions:

  • Document that, maybe add a warning in configure, and modify the multi-threading tests to print a warning and exit.
  • Modify the regex-morph.c PCRE2 code to use C++ threads, and modify configure.ac and link-grammar/Makefile.am accordingly.
  • Modify the regex-morph.c PCRE2 code to use Pthreads.
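
As a rough illustration of the Pthreads option, here is a minimal sketch of per-thread scratch storage built on pthread_key_create and pthread_once. It assumes the crash comes from match scratch state being shared between threads when threads.h is unavailable; that is a guess based on the quote above, not something verified against regex-morph.c. The names scratch_t, get_thread_scratch and scratch_is_per_thread are hypothetical and do not exist in link-grammar.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-thread match scratch state. With PCRE2
 * this would presumably be a pcre2_match_data * that is released with
 * pcre2_match_data_free() instead of free(). */
typedef struct { int uses; } scratch_t;

static pthread_key_t scratch_key;
static pthread_once_t scratch_once = PTHREAD_ONCE_INIT;

/* Destructor: called for a thread's non-NULL value when that thread exits. */
static void scratch_free(void *p)
{
    free(p);
}

/* Runs exactly once per process, no matter how many threads race here. */
static void scratch_key_init(void)
{
    int rc = pthread_key_create(&scratch_key, scratch_free);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's private scratch object, creating it on
 * first use. Returns NULL on allocation failure. */
scratch_t *get_thread_scratch(void)
{
    pthread_once(&scratch_once, scratch_key_init);

    scratch_t *s = pthread_getspecific(scratch_key);
    if (s == NULL) {
        s = calloc(1, sizeof *s);
        if (s == NULL) return NULL;
        if (pthread_setspecific(scratch_key, s) != 0) {
            free(s);
            return NULL;
        }
    }
    return s;
}

/* Thread body: report whether this thread got an object distinct from
 * the one owned by the thread that spawned it. */
static void *worker(void *parent_scratch)
{
    scratch_t *mine = get_thread_scratch();
    return (mine != NULL && mine != parent_scratch) ? (void *)1 : NULL;
}

/* Returns 1 if a second thread receives its own scratch object. */
int scratch_is_per_thread(void)
{
    pthread_t t;
    void *result = NULL;
    scratch_t *here = get_thread_scratch();

    if (here == NULL) return 0;
    if (pthread_create(&t, NULL, worker, here) != 0) return 0;
    pthread_join(t, &result);
    return result != NULL;
}
```

With this pattern each thread lazily allocates its own object and the key destructor frees it when the thread exits, so no locking is needed around the match call itself. Whether this maps cleanly onto the existing threads.h code path would need to be checked by someone who knows that file.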

My brief searching suggests that C11 threads (threads.h) are not well supported and that pthreads is the recommended alternative. pthreads are already used elsewhere in the code.

Maybe using a single threading library for the entire code base would be a good idea. I can't help with that, however, as I haven't written any multithreaded code before.
