MemShrink’s 6th Birthday

MemShrink, it’s still a thing

Although not as active, we still have a MemShrink group at Mozilla. We've transitioned from an all-out assault on memory usage to mostly just attempting to keep memory usage sane. I wasn't around when things started, but when I joined there were at least seven people actively attending our MemShrink triage meetings; now we're down to two. Some members have moved on, others have transitioned through, but really it comes down to the fact that we did a pretty good job of getting memory under control, and with limited resources there were more important tasks to look at.

Fear not, we haven't abandoned the project. We're just in a bit of a lull. With big pushes for multiple content processes and the Quantum project, I think we're going to see the need to ramp up MemShrink again. In the meantime, rest assured we're still chugging along, just at a slower pace.

Big Ticket Items – 2014

Three years ago Nicholas Nethercote wrote a blog post celebrating MemShrink's 3rd birthday and put together a list of important work we saw coming up. Let's see how those projects went.

Better regression detection

AWSY has moved into our testing automation system and we now have automated regression detection through Perfherder. I think we can declare victory here.

Devtools

The devtools team added a memory tab. Dan Callahan and Nick Fitzgerald put together a nice writeup of the new memory tool. There’s more work that can be done, but most of the devtools team’s focus is on performance profiling these days. It sounds like it could become a priority again next year.

GC Arena Fragmentation

Jon Coppeard did some heroic work (64 patches!) and got compacting GC landed. Initial measurements showed an 8% reduction in JS memory usage, which is quite impressive. You can read more details in Jon's blog post about [compacting garbage collection in SpiderMonkey](https://hacks.mozilla.org/2015/07/compacting-garbage-collection-in-spidermonkey/).

Tarako

We actually shipped the 128MB phone! It never took off in its target market and eventually the entire FirefoxOS project was shut down, but I'm still super impressed we achieved such a feat.

Windows OOM crashes

This is an ongoing problem. We still think the push to 64-bit Windows builds will be a huge win. We have a plan to upgrade users from 32-bit to 64-bit if their system can handle it and will make 64-bit the default in Firefox 55.

In the meantime, the JS engine is now smarter about requesting memory on Windows, and multi-process Firefox has shipped.

We had hopes that upgrading our memory allocator would help as well, but we’ve since abandoned that effort.

Big Ticket Items – 2017

That was a nice trip down memory lane, but now we need to look forward. Let’s take a look at some of what I see as our next big ticket items.

Reduce JS memory usage and increase sharing of data across processes

The JavaScript engine is probably our biggest upcoming target for reducing memory usage, particularly with multiple content processes enabled. There's some impressive work going on to have our core JavaScript modules share a single global, and initial testing has shown some pretty big wins.

In general we need to think about ways to share more data across processes.

Improved devtools for memory analysis

The devtools team did a great job with their initial iteration of memory profiling, but there's room for a more refined UI and for tying in information from our cycle collector on the C++ side.

Expanded testing

I'd like to get the ATSY project automated so that we can get consistent numbers on how we fare against other browsers. This kind of comparison has been a boon for JavaScript performance; I can see it being a good motivator for improving memory usage as well. An updated test corpus that uses modern web features would be a big improvement. Making it easier to track the memory impact of WebExtensions would also be great.

Conclusions

We ticked off 4 out of 5 of our big ticket items. 64-bit builds by default on Windows are just around the corner, so let's just go ahead and count that as 5 out of 5. I see plenty of future challenges for the MemShrink group, particularly once the dust settles from enabling multiple content processes and the various Quantum projects.

Let me know if I missed any big improvements, I’m sure there are plenty!

Are we slim yet is dead, all hail are we slim yet

Aside from some pangs of nostalgia, it is with great pleasure that I announce the retirement of areweslimyet.com, the areweslimyet github project, and its associated infrastructure (a sad computer in Mountain View under dvander’s desk and a possibly less sad computer running the website that’s owned by the former maintainer).

Wait, what?

Don't worry! Are we slim yet, aka AWSY, lives on; it's just moved in-tree and is run within Mozilla's automated testing infrastructure.

For equivalent graphs check out:

  • Explicit
  • RSS
  • Miscellaneous

You can build your own graph from Perfherder. Just choose '+ Add test data', then 'awsy' for the framework, and then the tests and platforms you care about.

Wait, why?

I spent a few years maintaining and updating AWSY, and some folks spent a fair amount of time on it before me. It was an ad hoc system that had bits and pieces bolted on over time. I brought it into the modern age, moving it from the mozmill framework over to marionette, adding support for e10s, and cleaning up some old, slightly busted code. I tried to reuse packages developed by Mozilla to make things a bit easier (mozdownload and friends).

This was all pretty good, but things kept breaking. We weren't in-tree, so breaking changes to marionette, mozdownload, etc. would cause failures for us, and it would take a while to figure out what happened. Sometimes the hard drive filled up. Sometimes the status file would get corrupted by a poorly timed shutdown. It was just a lot of maintenance for a project with nobody dedicated to it.

The final straw was the retirement of archive.mozilla.org for what we call tinderbox builds, builds that are done more or less per push. This completely broke AWSY back in January, and we decided it was just better to give in and go in-tree.

So is this a good thing?

It is a great thing. We've gone from 18,000 lines of code to 1,000 lines of code. That is not a typo. We now run on linux64, win32, and win64. Mac is coming soon. We turned on e10s. We have results on mozilla-inbound, autoland, try, mozilla-central, and mozilla-beta. We're going to have automated crash analysis soon. We were able to use the project to give the green light to the e10s-multi project on memory usage.

Oh and guess what? Developers can run AWSY locally via mach. That’s right, try this out:

mach awsy-test --quick

Big thanks go out to Paul Yang and Bob Clary, who pulled all this together — all I did was a quick draft of an awsy-lite implementation — they did the heavy lifting: getting it in-tree, integrating it with task cluster, and integrating it with mach.

What’s next?

Now that we’re in-tree we can easily add new tests. Imagine getting data points for running the AWSY test with a specific add-on enabled to see if it regresses memory across revisions. And anyone can do this, no crazy local setup. Just mach awsy-test.

A Rust-based XML parser for Firefox

Goal: Replace Gecko’s XML parser, libexpat, with a Rust-based XML parser

Firefox currently uses an old, trimmed down, and slightly modified version of libexpat, a library written in C, to support parsing of XML documents. These documents include plain old XML on the web, XSLT documents, SVG images, XHTML documents, RDF, and our own XUL UI format. While it has served its purpose well, it has long been unmaintained and has been a source of many security vulnerabilities, a few of which I've had the pleasure of looking into. It's 13,000 lines of rather hard to understand code, and tracing through everything when looking into security vulnerabilities can take days at a time.

It's time for a change. I'd like us to switch over to a Rust-based XML parser to help improve our memory safety. We've done this already with at least two other projects: an mp4 parser and a URL parser. This seems to fit well into that mold: a standalone component with past security issues that can be easily swapped out.

There have been suggestions to add full XML 1.0 v5 support, there's a 6-year-old proposal to rewrite our XML stack that doesn't include replacing expat, and there's talk of the latest and greatest, but not quite fully specced, XML5. These are all interesting projects, but they're large efforts. I'd like to see us make a reasonable change now.

What do we want?

In order to avoid scope creep and actually implement something in the short term, I just want a library we can drop in that has parity with the features of libexpat that we currently use. That means:

  • A streaming, SAX-like interface that generates events as we feed it a stream of data (see the sketch after this list)
  • Support for DTDs and external entities
  • XML 1.0 v4 (possibly v5) support
  • A UTF-16 interface. This isn’t a firm requirement; we could convert from UTF-16 -> UTF-8 -> UTF-16, but that’s clearly sub-optimal
  • As fast as expat with a low memory footprint
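
To make the shape of that first requirement concrete, here's a minimal sketch of what such a streaming, push-style interface might look like in Rust. All of the names here (XmlEvent, XmlSink, Parser::feed) are hypothetical, not an existing crate's API, and the actual tokenization is elided:

```rust
// A sketch only: none of these types exist yet, and a real parser
// would carry buffering and tokenizer state that is elided here.

/// Events emitted as the parser consumes input. Names and values are
/// borrowed UTF-16 slices, matching the interface discussed below.
pub enum XmlEvent<'a> {
    StartElement {
        name: &'a [u16],
        attributes: Vec<(&'a [u16], &'a [u16])>,
    },
    EndElement {
        name: &'a [u16],
    },
    CharacterData(&'a [u16]),
    Comment(&'a [u16]),
}

/// Placeholder error type.
pub struct XmlError;

/// The callback side; in Gecko this role is played by the driver that
/// feeds the tokenizer (today, nsExpatDriver).
pub trait XmlSink {
    fn handle_event(&mut self, event: XmlEvent<'_>) -> Result<(), XmlError>;
}

pub struct Parser {
    // Buffered partial tokens, entity tables, etc. would live here.
}

impl Parser {
    /// Feed a chunk of UTF-16 input; `is_final` marks the end of the
    /// document, mirroring expat's streaming style.
    pub fn feed(
        &mut self,
        chunk: &[u16],
        is_final: bool,
        sink: &mut dyn XmlSink,
    ) -> Result<(), XmlError> {
        // Tokenization elided: a real implementation would scan `chunk`
        // plus any buffered partial token and invoke
        // `sink.handle_event(...)` for each complete token found.
        let _ = (chunk, is_final, sink);
        Ok(())
    }
}
```

The key property is that the caller pushes data in as it arrives off the network and gets events back incrementally, rather than handing over a complete document up front.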

Why do we need UTF-16?

Short answer: That’s how our current XML parser stack works.

Slightly longer answer: In Firefox, libexpat is wrapped by nsExpatDriver, which implements nsITokenizer. nsITokenizer uses nsScanner, which exposes the data it wraps as UTF-16 and takes in nsAString, which, as you may have guessed, is a wide string. It can also read in C strings, but internally it performs a character conversion to UTF-16. On the other side, all tokenized data is emitted as UTF-16, so all consumers would need to be updated as well. This extends further out, but hopefully that's enough to explain why a drop-in replacement should support UTF-16.
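
To make the trade-off concrete, here's a rough sketch of the conversion round trip a UTF-8-only parser would force on every chunk of input. parse_chunk_via_utf8 is a hypothetical helper, not anything in Gecko, and the parsing itself is elided:

```rust
// Hypothetical illustration of the double conversion we'd like to avoid.
fn parse_chunk_via_utf8(chunk_utf16: &[u16]) -> Vec<u16> {
    // In: convert the UTF-16 data nsScanner hands us down to UTF-8.
    let utf8 = String::from_utf16_lossy(chunk_utf16);

    // (Actual parsing of `utf8` would happen here.)

    // Out: convert tokenized output back to UTF-16 for the consumers
    // described above. That's two full conversion passes per chunk.
    utf8.encode_utf16().collect()
}
```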

What don’t we need?

We can reduce the complexity of our parser by excluding parts of expat, and of more modern parsers, that we don't need. In particular:

  • Character conversion (other parts of our engine take care of this)
  • XML 1.1 and XML5 support
  • Output serialization
  • A full rewrite of our XML handling stack

What are our options?

There are three Rust-based parsers that I know of, none of which quite fit our needs:

  • xml-rs
    • StAX-based; we prefer SAX
    • Doesn’t support DTD, entities
    • UTF-8 only
    • Doesn’t seem very active
  • RustyXML
    • Is SAX-like
    • Doesn’t support DTD, entities
    • Seems to only support UTF-8
    • Doesn’t seem to be actively developed
  • xml5ever
    • Used in Servo
    • Only aims to support XML5
    • Permissive about malformed XML
    • Doesn’t support DTD, entities

Where do we go from here?

My recommendation is to implement our own parser, one that fits the needs and use cases of Firefox specifically. I'm not saying we'd necessarily start from scratch; it's possible we could fork one of the existing libraries or just take inspiration from a little bit of all of them, but we have rather specific requirements that need to be met.

Firefox memory usage with multiple content processes

This is a continuation of my Are They Slim Yet series; for background, see my previous installment.

With Firefox’s next release, 54, we plan to enable multiple content processes — internally referred to as the e10s-multi project — by default. That means if you have e10s enabled we’ll use up to four processes to manage web content instead of just one.

My previous measurements found that four content processes are a sweet spot for both memory usage and performance. As a follow-up, we wanted to run the tests again to confirm my conclusions and make sure that we're testing what we plan to release. Additionally, I was able to work around our issues testing Microsoft Edge and have included both 32-bit and 64-bit versions of Firefox on Windows; 32-bit is currently our default, with 64-bit a few releases out.

The methodology for the test is the same as in previous runs: I used the atsy project to load 30 pages and measure the memory usage of the various processes that each browser spawns during that time.

Without further ado, the results:

[Graph of browser memory usage by browser and platform; Chrome uses a lot.]

So we continue to see Chrome leading the pack in memory usage across the board: 2.4X the memory of Firefox 32-bit and 1.7X that of Firefox 64-bit on Windows. IE 11 does well; in fact, it was the only one to beat Firefox. Its successor Edge, the default browser on Windows 10, appears to be striving for Chrome-level consumption. On macOS 10.12 we see Safari going the Chrome route as well.

Browsers included are the default versions of IE 11 and Edge 38 on Windows 10, Chrome Beta 59 on all platforms, Firefox Beta 54 on all platforms, and Safari Technology Preview 29 on macOS 10.12.4.

Note: For Safari I had to run the test manually; they seem to have made some changes that cause all the pages from my test to be loaded in the same content process.