Assistance Needed: GridDB Cluster Configuration and Performance Optimization

Ellis_V · June 18, 2025, 11:35am

Hi everyone,

I’m currently working on a project where I’m implementing GridDB for time-series data storage and processing, and I’ve hit a few roadblocks I’d really appreciate help with.

I’ve set up a 3-node GridDB cluster in a cloud environment (Ubuntu 22.04, 4 vCPUs, 16 GB RAM per node) and managed to get basic container and row operations working. However, I’ve noticed that performance degrades when I try to scale writes across multiple clients, especially during bulk inserts (~100k+ rows). Insert speed drops significantly as more clients connect, even though CPU and memory usage remain within acceptable limits.

Here are a few questions I have:

1. Are there any recommended settings or configuration tweaks in gs_node.json or gs_cluster.json that could help improve multi-client write performance?
2. Is there a best practice for handling batch inserts with GridDB? I’m currently using the Java API with put() inside a loop. Would using multiPut() yield better performance?
3. How important is the network latency between nodes for cluster health and sync? My setup has ~5 ms latency between nodes. Should I be worried?
4. Is there a clear difference in performance between using fixed containers vs. time-series containers in high-volume use cases?

Any guidance, real-world benchmarks, or pointers to documentation I may have missed would be incredibly helpful. I’d also love to hear how others have handled similar scale issues with GridDB in production.

Thanks in advance for your time and insights!

israel · June 24, 2025, 3:44pm

To answer your questions directly:

1. The defaults in gs_node.json and gs_cluster.json are already optimized for best performance in most workloads.
2. multiPut() can bring large performance benefits. You can read more about it in this blog post on the developer site: "GridDB Optimization with Multi-Put and Query".
3. ~5 ms of latency between nodes is nothing to worry about. It is acceptable!
4. Time series containers are optimized for time series use cases. There may not be any raw performance difference, but many time-based queries will be faster with a time series container. You also get the time-series-specific queries that only this container type supports.
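
For anyone landing here with the same put()-in-a-loop pattern, here is a minimal sketch of the batching idea behind the multiPut() suggestion. It deliberately does not call GridDB: the `sink` parameter is a hypothetical stand-in for whatever bulk call you use. In the GridDB Java API that would, as far as I know, be `Container.put(Collection)` for one container or `GridStore.multiPut(Map<String, List<Row>>)` across containers, but check the API reference for your version. The class name, method names, and the batch size of 1,000 are assumptions for illustration, not recommended values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchInsertSketch {

    /** Splits rows into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    /**
     * Hands each batch to the sink in a single call and returns the number
     * of calls made. The sink is a placeholder for a real bulk-write call.
     */
    static <T> int writeInBatches(List<T> rows, int batchSize, Consumer<List<T>> sink) {
        int calls = 0;
        for (List<T> batch : partition(rows, batchSize)) {
            sink.accept(batch);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Stand-in data: 100,000 integers instead of real GridDB rows.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            rows.add(i);
        }
        int[] written = {0};
        // The lambda only counts rows; a real sink would issue the bulk write.
        int calls = writeInBatches(rows, 1_000, batch -> written[0] += batch.size());
        System.out.println(calls + " calls for " + written[0] + " rows");
    }
}
```

The point of the sketch is the call count: one request per batch instead of one per row. Whether that helps in a given cluster, and which batch size works best, is something to measure under your own multi-client load.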