Should I set the number of partitions based on my memory size? Can I update it in the future? According to your documentation, the number of partitions should fall between 10 and 100. What difference does the value make?
It mainly depends on the number of nodes. There is no need to set a large number of partitions per node.
The more partitions you have, the finer the data granularity and the more easily the data can be hit, but the more memory is consumed. That said, we suggest that you set the number of partitions based on both the number of nodes in the cluster and your memory size.
10 is enough for a 3-node cluster.
Is there a general suggestion in this regard? For example, how many partitions should be set for xxx memory size and xxx cached data, with xxx as the target cache hit ratio?
Unfortunately there’s no “standard” general advice, because it varies case by case. Usually we recommend that you set the number of partitions between 10 and 100, and 10 is enough for most single-host deployments.
The number of partitions doesn’t affect the cache hit ratio. Memory size is the factor that moves the needle.
It is usually 5 * the number of hard disks in the cluster.
It depends on the number and type of your storage disks. For HDDs, disk num * machine num would be OK; for SSDs, it could be disk num * machine num * 3.
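To make the two rules of thumb above concrete, here is a rough Python sketch. It is only an illustration, not an official tool: the function names are made up, and it assumes "disk num" means the number of disks per machine, which the answer above does not state explicitly.

```python
# Hypothetical helpers that just encode the rules of thumb from this thread.
# Assumption: "disk num" is the number of storage disks on each machine.

def suggest_partitions(disks_per_machine: int, machines: int, disk_type: str) -> int:
    """HDD: disk num * machine num; SSD: disk num * machine num * 3."""
    if disks_per_machine < 1 or machines < 1:
        raise ValueError("disks_per_machine and machines must be >= 1")
    multipliers = {"hdd": 1, "ssd": 3}
    kind = disk_type.lower()
    if kind not in multipliers:
        raise ValueError("disk_type must be 'hdd' or 'ssd'")
    return disks_per_machine * machines * multipliers[kind]


def suggest_partitions_by_total_disks(total_disks: int) -> int:
    """Alternative rule from the thread: 5 * number of hard disks in the cluster."""
    if total_disks < 1:
        raise ValueError("total_disks must be >= 1")
    return 5 * total_disks


if __name__ == "__main__":
    # Example: a 3-machine cluster with 2 disks on each machine.
    print(suggest_partitions(2, 3, "hdd"))       # 2 * 3 = 6
    print(suggest_partitions(2, 3, "ssd"))       # 2 * 3 * 3 = 18
    print(suggest_partitions_by_total_disks(6))  # 5 * 6 = 30
```

The two rules give different numbers for the same cluster, so treat the result as a starting point and compare it against the 10 to 100 range mentioned earlier in the thread.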