Conversation
| " config name=store_avro_kafka [path=JSON_FILE]\n" | ||
| " encoding=MODE" | ||
| " MODE is one of JSON or AVRO (default)." | ||
| " MODE is one of JSON or YYJSON or AVRO (default)." |
I don't think this should be a configurable mode. yyjson is a JSON implementation, like libjansson, not a different encoding. It comes out as JSON regardless of the library used to do the encoding.
If yyjson is going to be a required external library (and that is how this patch implements it in configure.ac), and it is a lot faster, then I would say just replace libjansson with yyjson.
On the other hand, if we want to make yyjson an optional external library, then we should have conditional compilation of the backend that implements the json mode. If only libjansson is available, compile against that. If yyjson is available, compile against that.
That's reasonable. The existing plugin doesn't use jansson (which in our tests is actually slower than the sprintf-based code in the ldmsd_decomp.c), at least using the json encoding mode.
If you can give me some guidance about libtool library presence testing and what defines it puts in place, I can change it so that it's conditional to libyyjson being on the system. It should be pretty easy to make it a seamless replacement.
```diff
  AM_CONDITIONAL([HAVE_LIBJANSSON], [test "x$HAVE_LIBJANSSON" = xyes])

+ AC_LIB_HAVE_LINKFLAGS([yyjson], [], [#include <yyjson.h>])
+ AM_CONDITIONAL([HAVE_LIBYYJSON], [test "x$HAVE_LIBYYJSON" = xyes])
```
You'll want to add these lines after the AM_CONDITIONAL:
AM_COND_IF([HAVE_LIBYYJSON],[
AC_DEFINE([HAVE_LIBYYJSON], [], [])
])
Then you don't need that "DEFS +=" in the Makefile.am. You can just start using "#ifdef HAVE_LIBYYJSON".
ldms/src/ldmsd/ldmsd.h (outdated)
```diff
   */
  #endif

+ int ldmsd_row_to_json_object_jansson(ldmsd_row_t row, char **str, int *len);
```
I think this was supposed to end in "_yyjson", not "_jansson". And it should be inside the ifdef/endif.
| " config name=store_avro_kafka [path=JSON_FILE]\n" | ||
| " encoding=MODE" | ||
| " MODE is one of JSON or AVRO (default)." | ||
| " MODE is one of JSON or YYJSON or AVRO (default)." |
This help string still says "or YYJSON"; it needs to be reverted.
Clearing the milestone from this, as we need a deeper discussion about where this might end up and how it will be built (yyjson as a hard requirement vs Chris' suggestion of building against it if present).
Sorry, I haven't pushed this forward because there were some concerns about accepting it. I'll try to address all the comments and get it rebased. Let me know what you all decide.
The existing JSON generation code uses sprintf and allocates quite a bit. The open source yyjson library, by limiting allocation and copying and using specialized per-type printing code, claims to hit gigabytes-per-second rates for both encode and decode. Switching to this code in the storage plugin critical section more than doubles the per-process workload we can handle at our site.
Use it to process JSON encoding in the avro_kafka plugin.
I think we should consider using this code in some paths. But I'm going to close this for now. If someone wants to consider it for future work, please open an issue. It is faster and scales better, but the API is more complicated and, in most of our use cases, not a critical factor.
This is a pretty basic version of this PR. Two bits of further work that'd probably be a good idea:
Finally, this compiles but it isn't 100% identical to the code we tested. I haven't had the opportunity to test it in our setup yet.