» Akademy 2014 - Come to my Profiling 101 Workshop!

Tue, 08/26/2014 - 19:08

Hello all!

I have the pleasure of attending Akademy again this year. Judging from past experience, I’m really looking forward to having a good time again. Lots of hacking, meeting known and unknown faces, drinking beer and socializing ahead! I also love that it’s in a (to me) new country again, and wonder what I will see of the Czech Republic and Brno!

This year, the conference schedule is a bit different from past years. Not only do we have the usual two days packed with interesting talks and keynotes. No - this year there will also be workshops on the third day! These are more in-depth talks which hopefully teach the audience some new skills, be it QML, mobile development, testing, or … profiling :) Yours truly has the honor of holding a one-hour Profiling 101 workshop.

I welcome all of you to attend my presentation. My plan, currently, is to do some live demoing of how I profile and optimize code. For that purpose, I just wrote a (really slow and badly written) word count test-app. I pushed the sources to kde:scratch/mwolff/akademy-2014.git. If you plan to join my workshop, I encourage you to download the sources and take a shot at optimizing it. I tried my best to write slow code this time, to leave plenty of opportunity for optimizations :) There is a lot of low-hanging fruit in the code. I’m confident that I’ll be able to teach you some more advanced tips and tricks on how you can improve a Qt application’s performance. We’ll see in the end who can come up with the fastest version :)

During my workshop, I’ll investigate the performance of the wordcount app with various tools. On the one hand, this should teach you how to use the powerful existing open source tools such as Linux perf and the Valgrind suite. I will also show you Intel VTune though, as it is still unparalleled in many aspects and available free of charge for non-commercial usage on Linux. Then, I’ll present a few of my own tools to you, such as heaptrack. If you have never heard of some of these tools, go try them out before Akademy!

I’ll see what else I can fit in, and maybe I’ll extend my akademy-2014.git scratch repository with more examples over the next few days.

Bye, hope to see you soon!

» Valgrind Highlighting on BKO

Thu, 07/04/2013 - 19:30

Hey all,

you haven’t heard anything from me for quite some time… Thing is, this is my last “regular” semester of university, where I have two lab courses that are very time-demanding. The year after, I’ll be spending time on my master’s thesis, which hopefully will allow for some more leisure time for KDE.

Anyhow, a small project which I just worked on, to write a few lines of code again, was to make my BKO backtrace highlighter work on Chromium. It requires the Tampermonkey extension to get more compatibility with Greasemonkey of Firefox fame; otherwise it works more or less out of the box!

After fixing some other small inconsistencies I can now say that it works fine in both Firefox and Chromium! I furthermore took the liberty of extending its functionality a bit: You now also get highlighted Valgrind traces, i.e. those generated by memcheck and other error reporting tools - yay!

fig.: highlighting of a Valgrind memcheck trace with additional navigation helpers, using Chromium

Oh, and before I forget: See you all next week at Akademy! I’m so looking forward to the intense hacking and socializing, esp. since I missed last year! And: If you are interested in how to improve the performance of your application(s), make sure to attend my talk on Sunday: Apps On Speed

» Kate/KDevelop Sprint Vienna 2012 Take 1

Thu, 10/25/2012 - 13:37

Hello everyone!

I’m finally taking some time to blog again. I’m currently in Vienna for the joint KDevelop/Kate sprint together with lots of other hackers. Many thanks to Joseph for planning and partially financing this sprint! And of course, as usual, many thanks to the KDE e.V. and all the donors for bringing in the rest of the money required to pull something like this off!

Anyhow, considering that the sprint has been running since Tuesday, I need to catch up quite a bit… Actually, I have to start even before that, since I committed something quite noteworthy in KDevelop and KMail last week.

Reducing Memory Consumption
Shared Data References

I attended the recent Akonadi sprint that took place at the KDAB office in Berlin (where I work btw.). I heard that Alex Fiestas would come and show us his memory problems in KMail, which sooner or later was eating multiple GBs of memory for him. That sounded like a fun task, since fixing performance issues is what I love to do :) So I investigated it with Valgrind/Massif and my pmap script. After quite some time I came up with a patch to fix the memory increase, which is waiting for Stephen Kelly to review. It should be merged into master very soon™.

Now some technical background: What was the issue here? Why wasn’t it found earlier? Usually developers run e.g. KMail through Valgrind with the leak checker and fix all issues. The same was done lots of times in KMail, and there were no problems reported. Why then is the memory still increasing over time? The issue is that this is technically not a “leak”, i.e. when you close the application all memory is properly released. Instead, there was a logical error that resulted in KMail’s ETM (basically the item model for mails and stuff) pushing shared data items into a QHash without ever deleting them. If you close the app though, the QHash is cleared automatically and all shared data is properly freed, hence Valgrind won’t report any leaks.

How does one find such an issue then, though? Actually, this is really hard… I ran KMail through Valgrind and looked at the top allocation, which relates to the shared data item. But Massif will only show you where the data is being allocated, not where the shared data items are still referenced and thus prevent a proper deallocation. What now? GDB! Yes, the best way I found was adding a breakpoint in the copy constructor of the shared data class and looking at the surrounding code to see whether it behaves properly… Does anyone have a more efficient way to debug such issues? I could imagine that this can potentially take ages to figure out… In KMail at least I could find the problematic place quite fast.
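To illustrate why a leak checker stays silent in such a case, here is a minimal Qt-free sketch. `Payload`, `Model`, `load()`, `unload()` and `retainedAfterUse()` are hypothetical names made up for this example, and `std::shared_ptr` in a `std::unordered_map` merely stands in for shared data items in a `QHash`; it is not the actual KMail code or fix.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a shared data item (e.g. the payload of a mail).
struct Payload {
    std::string body;
};

// Hypothetical model that remembers every payload it ever handed out.
class Model {
public:
    std::shared_ptr<Payload> load(int id) {
        auto payload = std::make_shared<Payload>(Payload{std::string(1024, 'x')});
        m_cache[id] = payload; // the logical error: this reference is never dropped
        return payload;
    }
    // One possible remedy (an assumption): forget the item once it is unused.
    void unload(int id) { m_cache.erase(id); }
    std::size_t cached() const { return m_cache.size(); }

private:
    // Destroyed together with the model, so a leak checker sees nothing at exit.
    std::unordered_map<int, std::shared_ptr<Payload>> m_cache;
};

// Returns how many payloads the model still keeps alive after every caller
// has already dropped its own reference.
std::size_t retainedAfterUse(bool withFix, int items) {
    Model model;
    for (int id = 0; id < items; ++id) {
        std::weak_ptr<Payload> watch;
        {
            auto payload = model.load(id);
            watch = payload;
        } // the caller is done with the payload here
        if (withFix) {
            model.unload(id);
            assert(watch.expired()); // the last reference is gone, memory is freed
        }
    }
    return model.cached();
}
```

Calling `retainedAfterUse(false, 100)` leaves all 100 payloads alive for as long as the model exists, even though nobody uses them anymore, while `retainedAfterUse(true, 100)` leaves none. In both cases everything is released when the model is destroyed, which is why only a heap profiler such as Massif shows the growth.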

Implicit Sharing

Now while this apparently fixes the ever-increasing memory consumption of KMail somewhat, I thought we could do some more improvements. The patch for this is similar to what I also did for Massif Visualizer once. By creating a central cache we can leverage Qt’s implicit sharing for common strings (in this case e.g. email addresses, domains, …). This way, if you load a folder containing e.g. a mailing list, you will have the list’s main email address only once in memory. Before, that address would be loaded into memory once for every email in the folder…
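The idea of such a central cache can be sketched without Qt. Note the assumptions: `StringCache` and `intern()` are hypothetical names, and since `std::string` is not implicitly shared the way `QString` is, this sketch hands out pointers to a single stored copy instead. With Qt one would return the cached `QString` itself and let implicit sharing do the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical central cache: every distinct string is stored exactly once.
class StringCache {
public:
    // Returns a pointer to the single stored copy of `value`.
    // The pointer stays valid for the lifetime of the cache, because
    // std::unordered_set never moves its elements, not even on rehash.
    const std::string* intern(const std::string& value) {
        const std::string* shared = &*m_strings.insert(value).first;
        assert(*shared == value);
        return shared;
    }

    std::size_t uniqueStrings() const { return m_strings.size(); }

private:
    std::unordered_set<std::string> m_strings;
};
```

If a thousand mails from the same list all call `intern("list@example.org")` (a made-up address), they all receive the same pointer and the cache holds a single copy of the text.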

Now the above was an interesting detour into a project that I don’t usually contribute to. Since I use KMail all the time though, it is just fair to give back and help out the few KDEPIM people a bit.


Back to my favorite pet project: KDevelop :) In the spirit of the memory fixes above, I took another look at the memory consumption of KDevelop. Turns out, we had a similar issue where we did not use implicit sharing properly. This resulted in quite some useless allocations blowing up the memory consumption (in this case, the KUrls of every file in the projects loaded for a session). The fix is already in master. Not only should that decrease the memory consumption considerably for KDevelop sessions with many files in them. No, it should actually save quite a few instructions and thus be much faster as well. Enjoy!

So… quite a long blog post again - sorry for that :) Expect some more KDevelop news in the next few days - we have lots of interesting stuff happening here at the sprint! Cheers and many thanks to Joseph and the KDE e.V. again!

» Improving Massif-Visualizer For Large Data Files

Fri, 03/16/2012 - 15:42

As I just wrote in another article, Massif is an invaluable tool. The Visualizer I wrote is well appreciated and widely used as far as I can see.

A few days ago though, I did a very long (~16h) Massif run on an application, which resulted in a 204MB massif.out data file. This proved to be a very good stress test for my visualizer, which prompted me to spend some time on optimizing it. The results are pretty nice I think, so look forward to Massif-Visualizer 0.4:

Reduced Memory Consumption

Yeah, meta eh? Just how I like it! I’ve used Massif to improve the memory consumption of Massif-Visualizer, and analyzed the data in the Visualizer of course… :)

Initial Version
fig. 1: initial memory consumption of the visualizer

The initial version of my visualizer took about 470MB of memory to load the 204MB data file above. 80% of that was required for QString allocations in the callgraph of each detailed snapshot, i.e. the function signatures and locations. See fig. 1 for the details.

QString to QByteArray
fig. 2: `QByteArray` instead of `QString`: 50% less memory

Thomas McGuire gave me the tip of using QByteArray instead, since the Massif callgraph data is just ASCII data. We can convert the data to QString where required, essentially saving us 50% of the memory consumption. You can see that applied in fig. 2. It was simple to code and already reduced the memory consumption considerably.
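The 50% follows from the storage width: QString keeps its text as 16-bit UTF-16 code units, whereas QByteArray holds plain bytes. A Qt-free sketch of the same arithmetic, with `std::u16string` and `std::string` standing in for the two Qt classes and `widen()`/`payloadBytes()` being hypothetical helpers:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts ASCII text to 16-bit code units, analogous to turning
// 8-bit data into a QString only where it is really needed.
std::u16string widen(const std::string& ascii) {
    std::u16string wide;
    wide.reserve(ascii.size());
    for (unsigned char c : ascii) {
        assert(c < 0x80); // the sketch only handles plain ASCII
        wide.push_back(static_cast<char16_t>(c));
    }
    return wide;
}

// Bytes needed for the characters alone (ignoring allocator overhead).
std::size_t payloadBytes(const std::string& s) {
    return s.size() * sizeof(char);
}

std::size_t payloadBytes(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}
```

On common platforms, where `char16_t` is two bytes wide, any ASCII function signature needs twice as many payload bytes in its widened form as in its 8-bit form.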

Implicit Sharing
fig. 3: leveraging implicit sharing

I committed the above, thinking this was it. But thanks to the awesome people in the KDE community, this time André Wöbbeking, I was thankfully proven wrong: He commented on my commit, arguing that I should try to leverage the implicit sharing of Qt containers, such as QByteArray. After all, the strings we have here are function signatures and file locations, which are repeated quite often. Especially when you have recursion in your call tree, or the same functions are encountered again and again in Massif snapshots, you can potentially save a lot of memory by leveraging implicit sharing.

Personally, I’m surprised to see just how much this gains in this case! See fig. 3, where the string allocations are nearly gone completely from the Massif log! Now only the tree node allocations, and the containers saving them, are visible in the memory log - something I do not plan to reduce further.

If you are interested in how this was implemented, take a look at commit 4be5dad13fb.

Final Notes

I think this shows quite nicely how to improve the memory consumption of an application. If you want to verify my results, I’ve uploaded the massif log files. Remember that you can open compressed files seamlessly in Massif-Visualizer. The file contains the test-data of the 16h Massif run.

You should probably use the latest Massif-Visualizer code though, since I’ve also optimized the performance of it considerably compared to the last released version 0.3. Furthermore, data files are now loaded in the background, showing a nice progress bar while doing that. If you open the big data file in 0.3 you’ll notice why I decided to optimize the visualizer :)

An interesting thing to note btw. is that the callgrind data format compresses file names and function signatures, yielding much smaller data files and reducing KCacheGrind’s memory consumption, esp. since it will automagically leverage the implicit sharing of Qt’s string classes.
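As far as the Callgrind format documentation describes it, name compression writes a name together with a numeric id the first time it occurs, e.g. `fn=(1) main`, and only the id, `fn=(1)`, on later occurrences. A minimal sketch of that scheme; `NameCompressor` and `compress()` are hypothetical and not part of Valgrind or KCacheGrind:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical writer-side helper for "(id) name" style compression.
class NameCompressor {
public:
    // Returns "(id) name" for a name seen for the first time
    // and just "(id)" for every later occurrence.
    std::string compress(const std::string& name) {
        const auto it = m_ids.find(name);
        if (it != m_ids.end()) {
            return "(" + std::to_string(it->second) + ")";
        }
        const int id = static_cast<int>(m_ids.size()) + 1;
        assert(id > 0);
        m_ids.emplace(name, id);
        return "(" + std::to_string(id) + ") " + name;
    }

private:
    std::unordered_map<std::string, int> m_ids;
};
```

A reader of such a file only has to keep one string per id, which is exactly the situation where implicit sharing comes for free.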

Now it is probably time to stop slacking and start work-work again :) I do have quite a few more ideas for the next Massif-Visualizer though. An export functionality for the graphs is especially high on my TODO list!

» Massif Visualizer 0.3 released

Sun, 11/20/2011 - 19:17

Hey all!

I’m happy to announce the release of Massif-Visualizer 0.3. You can download the sources here:

Highlights of this release:

  • translations into 18 different languages
  • basic support for hiding of functions via context menu
  • basic support for custom allocators
  • configurable precision of memory consumption display
  • various optimizations, bug fixes and other improvements. Take a look at the changelog for more information

Future Development

It took me much too long to get this release out and I hope to do better in the future. Current git master already contains some new patches - try it out! I especially like the improved display of the callgraph, which now aggregates the tails of the callgraph tree, i.e. the end of the backtrace, which mostly starts at main() etc.

If you have ideas for the future, or have any issues with Massif-Visualizer, report a bug on .


I don’t want to waste my time on packaging and instead want to concentrate on development. So here again a public plea to distributors: Please include Massif-Visualizer in future releases and let users download it seamlessly through your package manager. Starting with this release, Massif-Visualizer releases will follow the usual KDE procedure and the source tarballs will be hosted on the KDE mirrors. Hope this makes it simpler for packagers!

Ubuntu users can use Aurélien Gâteau’s PPA: . Arch Linux users can get it via AUR, and Gentoo also has an overlay for it, it seems. I also think that openSUSE has a package for it. Please add info for $your-distro in the comments below.


» Should all callgrind bottlenecks be optimized?

Thu, 12/09/2010 - 19:12

Hey all,

I’d like to have some feedback from you. Consider this code:

    #include <cstring>   // memset
    #include <iostream>  // only needed for the commented-out debug output

    using namespace std;

    struct List {
        List(int size) {
            begin = new int[size];
            // memset takes a byte count, so scale by the element size
            memset(begin, 0, size * sizeof(int));
            end = begin + size;
        }
        ~List() {
            delete[] begin;
        }
        int at(int i) const {
            return begin[i];
        }
        int size() const {
            // std::cout << "size called" << std::endl;
            return end - begin;
        }
        int& operator[](int i) {
            return begin[i];
        }

    private:
        int* begin;
        int* end;
    };

    int main() {
        const int s = 1000000;
        for (int reps = 0; reps < 1000; ++reps) {
            List l(s);
            List l2(s);
            // version 1
            for ( int i = 0; i < l.size(); ++i ) {
            // version 2
            // for ( int i = 0, c = l.size(); i < c; ++i ) {
                // right-hand side was garbled in the source; l.at(i) is assumed
                l2[i] = l.at(i);
            }
        }
        return 0;
    }

If you run this through callgrind, you’ll see quite some time being spent in l.size(); the compiler doesn’t seem to optimize that away. Now, fixing this “bottleneck” is simple: look at version 2. That way, l.size() will only be called once and you’ll save quite some instructions according to callgrind.
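To make the difference between the two loop headers countable without a profiler, here is a small sketch. `CountingList` and the two helper functions are hypothetical; the call counter is a deliberate side effect that keeps the compiler from eliding the calls, which it is otherwise free to do for a trivial inlined `size()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical list that counts how often size() gets called.
class CountingList {
public:
    explicit CountingList(std::size_t size) : m_data(size, 0) {}

    std::size_t size() const {
        ++m_sizeCalls;
        return m_data.size();
    }
    int& operator[](std::size_t i) {
        assert(i < m_data.size());
        return m_data[i];
    }
    std::size_t sizeCalls() const { return m_sizeCalls; }

private:
    std::vector<int> m_data;
    mutable std::size_t m_sizeCalls = 0;
};

// version 1: the loop condition calls size() before every iteration
std::size_t callsWhenEvaluatedPerIteration(std::size_t n) {
    CountingList list(n);
    for (std::size_t i = 0; i < list.size(); ++i) {
        list[i] = 1;
    }
    return list.sizeCalls();
}

// version 2: size() is evaluated once and cached in c
std::size_t callsWhenHoisted(std::size_t n) {
    CountingList list(n);
    for (std::size_t i = 0, c = list.size(); i < c; ++i) {
        list[i] = 1;
    }
    return list.sizeCalls();
}
```

For a list of n elements, version 1 calls `size()` n + 1 times (the last call is the one that ends the loop), while version 2 calls it exactly once.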

Now, my first impression was: Yes, let’s fix this! On the other hand, this optimization is not really that noticeable in terms of user experience. So my question is: Is it worth it? Should everything one sees in callgrind that is easily avoidable and optimizable (like the stuff above) be optimized?

I ask because QTextEngine e.g. doesn’t use the optimized version and I wonder whether I should create a merge request for that. According to callgrind the difference is noticeable: One of my testcases shows ~8% of the time being spent in QVector<QScriptItem>::size() (via QTextEngine::setBoundary()). In Kate the difference is even bigger with ~16% of the time being spent in QList<QTextLayout::FormatRange>::size() via QTextEngine::format(). Hence I’d say: yes, let’s optimize that. I just wonder whether it’s noticeable in the end.


» Massif Visualizer 0.2 released

Sun, 11/07/2010 - 00:16

Hey all!

I’m happy to release Massif Visualizer v0.2. This is mainly a “fix the build-system” release; no new features have been added.

You can download it here:

Mac Support

Thanks to the reports by Chris Jones it’s now possible to build and use Massif Visualizer on Mac OS X, see e.g.:

He has also submitted the portsfile for inclusion in Macports:

KGraphViewer now optional

I’ve made the KGraphViewer dependency optional, in case anyone does not want it (even though this removes like 50% of the tool’s features).

KDE Infrastructure

I’ve also prepared the steps for moving Massif-Visualizer into KDE Extragear and asked kde-devel for review. I already use the KDE infrastructure now:

git clone git://
Bug tracker:
Mailing List:

This also means that I’ll shortly get translations by the awesome KDE-i18n-Team, so stay tuned for a 0.3 including translations!

openSUSE Build Service

I’ve also spent quite some time today battling with OBS and can provide at least packages for Fedora, Mandriva and openSUSE now. I’m still waiting for help on the remaining issues and once they are resolved I’ll add the remaining packages.

    ChangeLog for massif-visualizer v0.2
    ====================================

    * Milian Wolff: set version to 0.2
    * Milian Wolff: fix conditional
    * Milian Wolff: make kgraphviewer dependency optional
    * Milian Wolff: fix FindKGraphViewer.cmake
    * Milian Wolff: fix .po name
    * Milian Wolff: remove some esoteric cli option for XGETTEXT that does not make
      any sense according to Albert
    * Milian Wolff: fix: install libs to make sure they can get loaded on OSX e.g.
    * Milian Wolff: fix compile warning about init order, improve style by having
      just one init per line
    * Milian Wolff: add export macros everywhere, make visualizer helper use the
      Massif namespace as well
    * Milian Wolff: add

» Massif Visualizer 0.1 released

Tue, 11/02/2010 - 19:06

Good news everyone!

Since Gaël finally got around to releasing KGraphViewer 2.1, I can go ahead and do the same for Massif Visualizer!

Download Massif Visualizer 0.1

This is the first release and I would be very happy if more users gave me their feedback. I intend to move to soon in order to leverage the KDE infrastructure (mostly translations, bug tracker, releases)… This also means: There are no translations yet! I also intend to update my OBS repository to provide packages for the first release.

Stay tuned for updates.

» Massif Visualizer - now with user interaction

Sat, 03/13/2010 - 16:55

Just a quick status update: Massif Visualizer now reacts to user input. Meaning: You can click on the graph and the corresponding item in the treeview gets selected, and vice versa. It’s a bit buggy since KDChart is not reliable in what it reports, but it works quite well already.

Furthermore the colors should be better now, peaks are labeled (better readable on bright color schemes, I’m afraid to say…), legend is shown, …

Now let’s see how I can make the treeview more useful!