Please create CachedMemoryMeterStrategy that caches shallow sizes of classes #65
Comments
The MemoryMeter.Builder could have a flag that controls caching with a default of false. When enabled, it would select the MemoryMeterStrategy as usual and wrap it in the fairly trivial CachedMemoryMeterStrategy. The new CachedMemoryMeterStrategy only needs to override the measure method and delegate everything else to the backing strategy.
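A minimal sketch of that wiring, assuming a stand-in single-method strategy interface and a hypothetical `withCachedShallowSizes` flag (Jamm's real Builder API may differ):

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Stand-in for Jamm's strategy interface, reduced to the one method the
// wrapper overrides; everything else would be delegated to the backing strategy.
interface MemoryMeterStrategy {
    fun measure(entity: Any): Long
}

// The proposed decorator: caches the shallow size per class and delegates
// cache misses to the backing strategy.
class CachedMemoryMeterStrategy(
    private val backing: MemoryMeterStrategy
) : MemoryMeterStrategy {
    private val cache = ConcurrentHashMap<Class<*>, Long>()
    override fun measure(entity: Any): Long =
        cache.getOrPut(entity.javaClass) { backing.measure(entity) }
}

// Hypothetical builder wiring: caching defaults to false and, when enabled,
// wraps whatever strategy was selected.
class Builder(private val selected: MemoryMeterStrategy) {
    private var cacheShallowSizes = false

    fun withCachedShallowSizes(enabled: Boolean) = apply { cacheShallowSizes = enabled }

    fun buildStrategy(): MemoryMeterStrategy =
        if (cacheShallowSizes) CachedMemoryMeterStrategy(selected) else selected
}
```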
As Jamm supports multithreading, you cannot use a simple HashMap; the cache must be thread-safe.
Thanks, I tried it out in my project (using Kotlin):

```kotlin
import java.util.concurrent.ConcurrentHashMap

class CachedMemoryMeterStrategy(
    val backingStrategy: MemoryMeterStrategy
) : MemoryMeterStrategy by backingStrategy {
    // Thread-safe cache of shallow sizes, keyed by class.
    private val cache = ConcurrentHashMap<Class<Any>, Long>()

    override fun measure(entity: Any): Long {
        return cache.getOrPut(entity.javaClass) {
            backingStrategy.measure(entity)
        }
    }
}
```

Then I created the MemoryMeter with this function:

```kotlin
private fun createMemoryMeter(strategyType: Guess, cacheShallowSizes: Boolean): MemoryMeter {
    val backingStrategy = MemoryMeterStrategies.getInstance().getStrategy(listOf(strategyType))
    val strategy = when (cacheShallowSizes) {
        true -> CachedMemoryMeterStrategy(backingStrategy)
        else -> backingStrategy
    }
    return MemoryMeter(
        strategy,
        Filters.getClassFilters(/* ignoreKnownSingletons = */ true),
        Filters.getFieldFilters(
            /* ignoreKnownSingletons = */ true,
            /* ignoreOuterClassReference = */ false,
            /* ignoreNonStrongReferences = */ true,
        ),
        NoopMemoryMeterListener.FACTORY,
    )
}
```

Here are some benchmarks on JDK 17 using Jamm 0.4.0 (I ran each test 3 times and report the duration in seconds):
So instrumentation shows a tiny improvement that's close to the margin of error, but the other strategies show a healthy 20% improvement. This suggests that some other overhead is dominating the runtime. I suspect that performance would improve much more if each array element weren't enqueued and instead were processed directly. But either way, a 20% improvement is a good start.
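As a rough illustration of that idea, here is a hedged sketch (hypothetical helper names, not Jamm's actual traversal code): elements of a reference array are handled inline, so leaf objects contribute their cached shallow size directly and only objects that still need deep traversal are pushed onto the work queue:

```kotlin
fun measureArrayInline(
    array: Array<Any?>,
    visited: MutableSet<Any>,                  // must use identity semantics in practice
    stack: ArrayDeque<Any>,                    // the deep-traversal work queue
    cachedShallowSize: (Class<*>) -> Long,     // assumed per-class cache lookup
    hasReferenceFields: (Class<*>) -> Boolean  // assumed per-class metadata lookup
): Long {
    var total = 0L
    for (element in array) {
        if (element == null || !visited.add(element)) continue
        if (hasReferenceFields(element.javaClass)) {
            stack.addLast(element)             // still needs deep traversal
        } else {
            // Processed directly: no queue entry created for this element.
            total += cachedShallowSize(element.javaClass)
        }
    }
    return total
}
```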
Created a separate topic (#66) for measuring array elements much more efficiently. Once that's implemented, I expect the improvements from caching to become more pronounced, as the array overhead is probably the main bottleneck when measuring collections. The memory saved by #66 should be greater than the cache overhead when dealing with medium-sized collections or larger. The rest can be left to documentation: it should explain that, when caching is enabled, the shallow size of every class encountered is cached for the entire lifetime of the MemoryMeter, so the cache keeps growing as new types are encountered.
@blerer, I just remembered that the caching I added before also included the relevant fields per class, in addition to the shallow size, to avoid re-traversing the inheritance hierarchy and re-filtering the fields each time.

Thinking about it some more, I also realized that the strategy level is not the best location for caching. Setting up a temporary cache each time measureDeep is called makes it easy to cache both the shallow size and the fields without changing the current architecture, and it also addresses the concern about memory usage growing unbounded across multiple usages.

Since I didn't have an easy way to try this, as a hack I temporarily copied the MemoryMeter class and added a local cache inside the measureDeep method, and found that performance was now measurably faster than all current strategies (about 20% faster than instrumentation and 35% faster than unsafe or spec). The extra memory from the temporary cache can be offset by the memory savings of handling arrays more efficiently (I could submit that separately), and the impact of a temporary cache would become even larger once arrays are handled more efficiently.

Since my hack is in Kotlin, I wanted to get your go-ahead before putting in the effort to create a pull request in Java. Does this sound good to you?
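For reference, a minimal sketch of the per-call cache idea (hypothetical structure, not the actual hack or Jamm's internals; arrays are elided since their shallow size depends on length and must never be cached per class):

```kotlin
import java.lang.reflect.Field
import java.lang.reflect.Modifier
import java.util.Collections
import java.util.IdentityHashMap

// Per-class data cached for the duration of a single measureDeep call:
// the shallow size plus the reference fields that survive filtering.
private class ClassInfo(val shallowSize: Long, val referenceFields: List<Field>)

fun measureDeepWithLocalCache(root: Any, shallowSizeOf: (Any) -> Long): Long {
    val cache = HashMap<Class<*>, ClassInfo>()  // discarded when this call returns
    val visited = Collections.newSetFromMap(IdentityHashMap<Any, Boolean>())
    val stack = ArrayDeque<Any>()
    var total = 0L
    stack.addLast(root)
    visited.add(root)
    while (stack.isNotEmpty()) {
        val current = stack.removeLast()
        if (current.javaClass.isArray) {
            // Arrays elided for brevity: their shallow size depends on
            // length, so it must not be cached per class.
            total += shallowSizeOf(current)
            continue
        }
        val info = cache.getOrPut(current.javaClass) {
            ClassInfo(shallowSizeOf(current), referenceFieldsOf(current.javaClass))
        }
        total += info.shallowSize
        for (field in info.referenceFields) {
            val child = field.get(current) ?: continue
            if (visited.add(child)) stack.addLast(child)
        }
    }
    return total
}

// Walks the hierarchy once per class, keeping non-static reference fields
// (the real implementation would also apply Jamm's configured filters).
private fun referenceFieldsOf(clazz: Class<*>): List<Field> {
    val fields = mutableListOf<Field>()
    var c: Class<*>? = clazz
    while (c != null) {
        for (f in c.declaredFields) {
            if (!Modifier.isStatic(f.modifiers) && !f.type.isPrimitive) {
                f.isAccessible = true
                fields += f
            }
        }
        c = c.superclass
    }
    return fields
}
```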
@daniel-rusu Your proposal consists of 3 parts:
Regarding 1, I answered in the #66 ticket. Do you have a use case where Jamm's performance is not good enough for you? If yes, could you describe your use case and share numbers?
@blerer a few clarifications and then I'll include my use-case at the bottom:
My use-case: I'm using Jamm to analyze the memory efficiency of custom algorithms to see how they scale, as most of these algorithms need to pre-process the data into a custom data structure. To accurately evaluate the memory efficiency of an algorithm, I need to run Monte Carlo simulations with randomized datasets over many variations of the parameters that affect the shape and distribution of the dataset. The more simulations I can run, the lower the variance of the results, so I can present higher-accuracy results with smaller margins of error. Since I have so many variations to consider in a limited amount of time, performance currently limits the number of runs I can perform, resulting in lower accuracy due to higher variability and wider margins of error in the distributions.
We use a single MemoryMeter instance for different measurements inside Apache Cassandra, so the application can run for months and, due to some bug, can sometimes end up measuring a significant number of classes. I am not so much worried by short-lived memory usage; that is fine for us. The issue is with more and more memory being used over time and never released.
Thanks, submitted the PR: the cache is created temporarily for each invocation of measureDeep, so that should address the issue with my initial proposal about it growing unbounded. I also included another benchmark using collections that contain somewhat average-looking data, along with its results.
I worked on a JVM memory measurement library that used reflection to analyze deep object graphs and found a very significant performance improvement by simply caching shallow sizes. I don't have references but I believe it was on the order of 10 times faster for some common scenarios.
The implementation would be quite trivial: create a new CachedMemoryMeterStrategy class which contains:
Scenarios where this is extremely beneficial:
Even if an object graph contains no repeated classes, this allows us to re-use the same MemoryMeter instance to measure a large number of similar objects so that the cache from the first measurement speeds up the measurements of subsequent objects.
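For illustration, a hedged usage sketch (reusing the createMemoryMeter helper from the comments above; `strategyType` and `records` are assumed, while MemoryMeter.measureDeep is Jamm's real entry point):

```kotlin
// One MemoryMeter measuring many similar objects: the first measurement
// warms the per-class cache, speeding up all subsequent measurements.
val meter = createMemoryMeter(strategyType, cacheShallowSizes = true)
val sizes = records.map { record -> meter.measureDeep(record) }
```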