memory

» Heaptrack 1.0.0

Tue, 02/28/2017 - 15:18

Hey all :)

I’ve finally managed to release heaptrack properly! The first stable release, v1.0.0, is available for download: https://download.kde.org/stable/heaptrack/1.0.0/src/

You can find more information on the official release announcement over on the KDAB page: https://www.kdab.com/heaptrack-v1-0-0-release/

If you want to read more about what heaptrack is, check out the README.md or have a look at the initial announcement of heaptrack, now three years old!

Cheers, happy profiling!

» Heaptrack - A Heap Memory Profiler for Linux

Tue, 12/02/2014 - 19:25

Hello everyone,

with a tingly feeling in my belly, I’m happy to announce heaptrack, a heap memory profiler for Linux. Over the last couple of months I’ve worked on this new tool in my free time. What started as a “what if” experiment quickly became such a promising tool that I couldn’t stop working on it, at the cost of neglecting my physics master’s thesis (who needs that anyway, eh?). In the following, I’ll show you how to use this tool, and why you should start using it.

» Kate/KDevelop Sprint Vienna 2012 Take 1

Thu, 10/25/2012 - 13:37

Hello everyone!

Finally I take some time to blog again. I’m currently in Vienna for the joint KDevelop/Kate sprint together with lots of other hackers. Many thanks to Joseph for planning and partially financing this sprint! And of course as usual many thanks to the KDE e.V. and all the donors for bringing in the rest of the money required to pull something like this off!

Anyhow, considering that the sprint has been running since Tuesday, I need to catch up quite a bit… Actually, I have to start even before that, since I committed something quite noteworthy in KDevelop and KMail last week.

Reducing Memory Consumption
KMail
Shared Data References

I attended the recent Akonadi sprint that took place at the KDAB office in Berlin (where I work, btw.). I heard that Alex Fiestas would come and show us his memory problems in KMail, which sooner or later was eating multiple GBs of memory for him. That sounded like a fun task, since fixing performance issues is what I love to do :) So I investigated it with Valgrind/Massif and my pmap script. After quite some time I came up with a patch to fix the memory increase, which is waiting for Stephen Kelly to review. It should be merged into master very soon™.

Now some technical background: What was the issue here? Why wasn’t it found earlier? Usually developers run e.g. KMail through Valgrind with the leak checker and fix all reported issues. The same was done lots of times for KMail, and there were no problems reported. Why then is the memory still increasing over time? The issue is that this is technically not a “leak”, i.e. when you close the application all memory is properly released. Instead, there was a logical error that resulted in KMail’s ETM (basically the item model for mails and stuff) pushing shared data items into a QHash without ever removing them. If you close the app though, the QHash is cleared automatically and all shared data is properly freed, hence Valgrind won’t report any leaks.

How does one find such an issue then, though? Actually, this is really hard… I ran KMail through Valgrind and looked at the top allocation, which relates to the shared data item. But Massif will only show you where the data is being allocated, not where the shared data items are still referenced and thus prevent a proper deallocation. What now? GDB! Yes, the best way I found was adding a breakpoint in the copy constructor of the shared data class and looking at the surrounding code to see whether it behaves properly… Does anyone have a more efficient way to debug such issues? I could imagine that this can potentially take ages to figure out… In KMail at least I could find the problematic place quite fast.

Implicit Sharing

Now while this apparently fixes the ever increasing memory consumption of KMail somewhat, I thought we could do some more improvements. Take a look at https://git.reviewboard.kde.org/r/106836/, for example. This patch is similar to what I also did for Massif Visualizer once. By creating a central cache we can leverage Qt’s implicit sharing for common strings (in this case e.g. email addresses, domains, …). This way, if you load a folder containing e.g. a mailing list, you will have the main email address (like list@domain.org) only once in memory. Before, that address would be loaded into memory once for every email in the folder…

Now the above was an interesting detour into a project that I don’t usually contribute to. Since I use KMail all the time though, it is just fair to give back and help out the few KDEPIM people a bit.

KDevelop

Back to my favorite pet project: KDevelop :) In the spirit of the memory fixes above, I took another look at the memory consumption of KDevelop. Turns out, we had a similar issue where we did not leverage implicit sharing properly. This resulted in quite some useless allocations blowing up the memory consumption (in this case, the KUrls of every file in the projects loaded for a session). The fix is already in master. Not only should it decrease the memory consumption considerably for KDevelop sessions with many files, it should also save quite a few instructions and thus be much faster as well. Enjoy!

So… quite a long blog post again - sorry for that :) Expect some more KDevelop news in the next days - we have lots of interesting stuff happening here at the sprint! Cheers and many thanks to Joseph and the KDE e.V. again!

» Improving Massif-Visualizer For Large Data Files

Fri, 03/16/2012 - 15:42

As I just wrote in another article, Massif is an invaluable tool. The [Visualizer](https://projects.kde.org/massif-visualizer) I wrote is well appreciated and widely used as far as I can see.

A few days ago though, I did a very long (~16h) Massif run on an application, which resulted in a 204MB massif.out data file. This proved to be a very good stress test for my visualizer, which prompted me to spend some time on optimizing it. The results are pretty nice I think, so look forward to Massif-Visualizer 0.4:

Reduced Memory Consumption

Yeah, meta eh? Just how I like it! I’ve used Massif to improve the memory consumption of Massif-Visualizer, and analyzed the data in the Visualizer of course… :)

Initial Version
fig. 1: initial memory consumption of the visualizer

The initial version of my visualizer took ~470MB of memory to load the 204MB data file above. 80% of that went into QString allocations for the callgraph of each detailed snapshot, i.e. the function signatures and locations. See fig. 1 for the details.

QString to QByteArray
fig. 2: `QByteArray` instead of `QString`: 50% less memory

Thomas McGuire gave me the tip of using QByteArray instead, since the Massif callgraph data is just ASCII data. QString stores its text as UTF-16, i.e. two bytes per character, while QByteArray needs only one byte per character. We can still convert the data to QString where required, so this essentially halves the memory needed for the strings. You can see that applied in fig. 2. It was simple to code and already reduced the memory consumption considerably.

Implicit Sharing
fig. 3: leveraging implicit sharing

I committed the above, thinking this was it. But thanks to the awesome people in the KDE community, this time André Wöbbeking, I was proven wrong: He commented on my commit, arguing that I should try leveraging the implicit sharing of Qt containers, such as QByteArray. After all, the strings we have here are function signatures and file locations, which are repeated quite often. Especially when you have recursion in your call tree, or the same functions are encountered again and again in Massif snapshots, you can potentially save a lot of memory by leveraging implicit sharing.

Personally, I’m surprised to see just how much this gains in this case! See fig. 3, where the string allocations are nearly gone completely from the Massif log! Now only the tree node allocations, and the containers storing them, are visible in the memory log - something I do not plan to reduce further.

If you are interested in how this was implemented, take a look at commit 4be5dad13fb.

Final Notes

I think this shows quite nicely how to improve the memory consumption of an application. If you want to verify my results, I’ve uploaded the massif log files. Remember that you can open compressed files seamlessly in Massif-Visualizer. The massif.out.data.bz2 file contains the test-data of the 16h Massif run.

You should probably use the latest Massif-Visualizer code though, since I’ve also optimized the performance of it considerably compared to the last released version 0.3. Furthermore, data files are now loaded in the background, showing a nice progress bar while doing that. If you open the big data file in 0.3 you’ll notice why I decided to optimize the visualizer :)

An interesting thing to note btw. is that the callgrind data format compresses files and function signatures, yielding much smaller data files and reducing KCacheGrind’s memory consumption, esp. since it will automagically leverage the implicit sharing of Qt’s string classes.

Now it is probably time to stop slacking and start work-work again :) I do have quite a few more ideas for the next Massif-Visualizer though; especially an export functionality for the graphs is high on my TODO list!

» Tracking Memory Consumption Using Pmap

Fri, 03/16/2012 - 14:55

Massif is a really nifty tool which is very powerful, especially paired with my visualizer. The caveat of course is that it slows down the application considerably; I’ve seen anything up to a factor of 100… I see no alternative to Massif when it comes to investigating where your memory problems come from. But if you just want to see whether you have a problem at all, tracking the total memory consumption should suffice.

A few days ago, I came across pmap on Stack Overflow, which makes it easy to track the RSS memory consumption of an application using the -x switch. Of course I had to write some bash magic to automate this process and visualize the data using Gnuplot! Behold:

memory consumption of a PhantomJS script over ~30min
usage

It’s simple, really: `track_memory.sh $(pidof myapp)`.

The default timeout is ~1s between snapshots; you can pass a different timeout as second parameter. The `sleep` command from GNU coreutils also accepts fractional values such as 0.1, which gives you more snapshots for fast-running apps.
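
Since the timeout is handed straight to `sleep`, a typo only shows up once the tracking loop is already running. Here is a minimal sketch of an up-front check. Note that `is_valid_interval` is a hypothetical helper, not part of track_memory.sh, and that it assumes a `sleep` which accepts fractional seconds, as the GNU coreutils one does.

```bash
# Hypothetical helper, not part of track_memory.sh: accept only plain
# decimal numbers such as 1, 0.1 or 2.5 as snapshot interval.
# Note that 0 passes this check as well, which would mean no pause at all.
is_valid_interval() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$'
}

# Try a few candidates, including some that should be rejected.
for candidate in 1 0.1 2.5 abc -1 ""; do
    if is_valid_interval "$candidate"; then
        echo "'$candidate' is usable as interval"
    else
        echo "'$candidate' would be rejected"
    fi
done
```

Something like this could run right after the arguments are read, before the first snapshot is taken.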

You can also run `show_memory.sh mem.log.$(pidof myapp)` while you are still tracking the memory. The gnuplot window that appears allows you to update the data intermittently, to zoom in and to create images such as the above.

Note: This kind of memory usage tracking costs nearly nothing; your application continues to work at full speed. Also be aware that this just shows the RSS memory consumption. Massif will always give you better, more detailed and accurate results. Still, I think this should already give you an idea of how your application behaves. If the graph goes up and up, you probably have a memory leak! Then it’s time to run Memcheck and/or Massif to find the issue and fix it!
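
To put a number on "the graph goes up and up", the log can also be summarized without gnuplot. The sketch below assumes the log layout that track_memory.sh writes: two `#` comment lines, then one `total kB <Kbytes> <RSS> <Dirty>` line per snapshot, so RSS is the fourth field. `rss_summary` is a hypothetical helper name, and the sample values are made up for illustration.

```bash
# Hypothetical helper: print first, last and peak RSS (in kB) of a log
# written by track_memory.sh, plus the growth from first to last snapshot.
rss_summary() {
    awk '
        /^#/ || NF < 4 { next }    # skip the two header comments and short lines
        {
            rss = $4 + 0
            if (!seen) { first = rss; seen = 1 }
            last = rss
            if (rss > peak) peak = rss
        }
        END {
            if (!seen) { print "no snapshots found"; exit 1 }
            printf "first=%d last=%d peak=%d growth=%d\n", first, last, peak, last - first
        }
    ' "$1"
}

# Made-up sample data in the assumed log layout.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
# myapp --some-arg
# 1
total kB          204800   51200   40960
total kB          206848   63488   52224
total kB          206848   58368   47104
EOF

rss_summary "$sample_log"
rm -f "$sample_log"
```

For the sample data this prints `first=51200 last=58368 peak=63488 growth=7168`. A positive growth alone does not prove a leak, but a value that keeps rising over a long run is a good reason to bring out Massif.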

track_memory.sh

You can find the most-recent version on GitHub: https://github.com/milianw/shell-helpers/blob/master/track_memory.sh

```bash
#!/bin/bash

#
# track memory of given application, identified by PID,
# using pmap -x, to show RSS and Dirty memory usage.
#
# visualization can later on be done with the
# show_memory.sh script.
#

pid=$1
# seconds between snapshots, defaults to 1
sleep=${2:-1}

# print usage and bail out if no process with that PID is running
if [[ -z "$pid" ]] || ! ps -p "$pid" > /dev/null 2>&1; then
    echo "cannot find program with pid $pid"
    echo "track_memory.sh PID [SLEEP_TIMEOUT]"
    echo
    echo "example: track_memory.sh \$(pidof someapp) 0.1"
    exit 1
fi

logfile=mem.log.$pid

# header: command line of the tracked process and the snapshot interval
echo "# $(ps -o command= -p "$pid")" > "$logfile"
echo "# $sleep" >> "$logfile"

cat "$logfile"

# append the "total" line of pmap -x until the process exits
while ps -p "$pid" > /dev/null 2>&1; do
    echo "snapshot $pid"
    pmap -x "$pid" | tail -n1 >> "$logfile"
    echo "$sleep"
    sleep "$sleep"
done

echo "done tracking, visualizing"
"$(dirname "$0")/show_memory.sh" "$logfile"
```

show_memory.sh

You can find the most-recent version on GitHub: https://github.com/milianw/shell-helpers/blob/master/show_memory.sh

```bash
#!/bin/bash

#
# visualize memory consumption over time
# as recorded by pmap / track_memory.sh
# script
#

logfile=$1

if [ ! -f "$logfile" ]; then
    echo "cannot find memory logfile: $1"
    echo
    echo "usage: show_memory.sh LOGFILE"
    echo
    echo "example: show_memory.sh mem.log.12345"
    exit 1
fi

# the first two lines of the log hold the command line and the snapshot interval
title=$(head -n1 "$logfile")
timeout=$(head -n2 "$logfile" | tail -n1)

# strip the leading "# " comment marker
title=${title/\# /}
timeout=${timeout/\# /}

# to also plot the total mapped size, add this line to the plot command:
# '$logfile' using 3 w lines title 'Kbytes', \

# columns of the pmap -x total line: 3 = Kbytes, 4 = RSS, 5 = Dirty
gnuplot -p -e "
set title '$title';
set xlabel 'snapshot ~${timeout}s';
set ylabel 'memory consumption in kB';
set key bottom right;
plot \
'$logfile' using 4 w lines title 'RSS' lt 1, \
'$logfile' using 4 smooth bezier w lines title 'RSS (smooth)' lt 7, \
'$logfile' using 5 w lines title 'Dirty' lt 2, \
'$logfile' using 5 smooth bezier w lines title 'Dirty (smooth)' lt 3;
"
```

Future

The above is nice, but I’m wondering whether this kind of utility should be added to ksysguard: it already allows you to track the total memory consumption of your system, yet I did not find a way to track just a single application and visualize its memory consumption.