Advice Needed on Optimizing GridDB Performance for Large-Scale IoT Data

Oliver60 · June 26, 2024, 11:15am

Hello everyone,

I’m fairly new to GridDB and am looking for tips on how to maximise its performance for a large IoT project I’m working on.

Here is some background and the questions I have:

Background of the Project: I’m building an Internet of Things system that gathers data from hundreds of sensors placed throughout a smart city. The data includes humidity, temperature, air quality, and other environmental indicators. Because the sensors transmit frequently, a large volume of time-series data is generated continuously.

GridDB was our first choice because of its scalability and excellent performance with time-series data.

Present Configuration:

GridDB Cluster: Three nodes
Rate of Data Ingestion: About 50,000 records per second
Data Model: Each sensor has its own time-series container in which its data is stored.
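
For a sense of scale, here is a rough back-of-envelope sketch in Python using only the figures above. It assumes the load is spread evenly across the three nodes and that the rate is sustained all day, neither of which I have verified; the function names are just illustrative.

```python
# Back-of-envelope sizing from the figures stated in this post.
RECORDS_PER_SEC = 50_000        # stated ingestion rate (approximate)
NODES = 3                       # stated cluster size
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_node_rate(records_per_sec: int, nodes: int) -> float:
    """Average records/sec per node, ASSUMING an even spread across nodes."""
    return records_per_sec / nodes


def records_per_day(records_per_sec: int) -> int:
    """Total records per day, ASSUMING the rate is sustained for 24 hours."""
    return records_per_sec * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"per node: {per_node_rate(RECORDS_PER_SEC, NODES):,.0f} records/sec")
    print(f"per day:  {records_per_day(RECORDS_PER_SEC):,} records")
```

So under those assumptions each node handles roughly 16,667 records per second, and the cluster takes in about 4.32 billion records per day.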

Problems and questions:

Performance of Data Ingestion: We occasionally see delays in data ingestion, especially during peak periods. What are the best practices for improving GridDB’s ingestion performance in a high-throughput setting?

Cluster Configuration: Are there any specific cluster configuration options that can help improve performance? For instance, network settings, memory allocation, or disk I/O optimisations?

Time-Series Containers: Given the number of sensors, is it more efficient to store each sensor’s data in a separate time-series container, or should we consider consolidating the data into a smaller number of containers?

Query Performance: What strategies can we use to maximise query performance, in particular for time-range queries and aggregations?

Scaling: Our cluster will need to be resized as the data volume grows. What are the most important things to think about and the best ways to scale a GridDB cluster horizontally?

Monitoring and Maintenance: Are there any recommended tools or methods for monitoring a GridDB cluster’s performance and overall health?

I would be very grateful for any information, suggestions, or sources you could offer. I’m keen to make sure our GridDB implementation is reliable and efficient by drawing on the community’s experience.

Thank you in advance.