CF Objective Notes - The JVM is Your Friend

May 15, 2014

Kai Koenig

The JVM architecture
wanted to build a platform independent runtime client
to do that you need a virtual machine

First JVM implementations were very simple
some limitations
Weak Java Memory Model
Memory leaks happened a lot, etc
Issues w/ concepts like final, volatile, etc
very simple non-generational memory
"mark and sweep" garbage collection

The HotSpot JVM was introduced with Java 1.2 (current is 1.8) as an add-on. Now it's baked in.

CF 8 or 9 you're probably on Java 6
CF10 is usually Java 7
Railo 3 is Java 6
Railo 4+ is Java 7
not sure if Adobe has certified CF to run on Java 8

Garbage Collection –
when you create objects, the Java runtime allocates memory for them
at some point the object is not necessary any more
in a http request cycle, that's obvious – when loading the page is done, there's no point in holding most variables in memory (unless they're part of the shared scope)
all those bits of memory that are allocated need to be cleaned up – hence garbage collection
if we didn't clean it up, sooner or later we'd get "out of memory" errors
so we clean up "Dead Objects"

GC algorithms start w/ some kind of "marking" process
so it has some idea of what's in memory
GC tries to run over all the memory and tries to figure out what's not needed any more
that's "marking"

Starts with a "root set" – initial set of objects
follows along the "object graph" to see what it can reach
if something can't be reached, that means it's dead – there's no possible way that object will get used again

the root set –
references on the call stacks of the JVM threads
looks at what's currently running, which threads do I use to process requests in the JVM. Each thread has a call stack where all the "things" it uses are set up.
Takes all those references in the call stacks to look for stuff that's no longer in use.
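
The reachability walk described above can be sketched in a few lines of Java. This is a toy model only, not how HotSpot is implemented; the `MarkSketch` and `Node` names are made up for illustration, and the "root set" is just a list handed in by the caller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of the "marking" step: follow references from the root set
// and remember every object that can be reached.
final class MarkSketch {

    // Hypothetical stand-in for a heap object that can reference others.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Walk the object graph from the root set and return everything reachable.
    static Set<Node> mark(List<Node> rootSet) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(rootSet);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {          // true only on the first visit
                pending.addAll(current.refs);   // follow outgoing references
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node request = new Node("request");     // pretend a thread's call stack points here
        Node query = new Node("query");
        Node orphan = new Node("orphan");       // nothing points here any more
        request.refs.add(query);
        query.refs.add(request);                // cycles do not confuse the walk

        Set<Node> live = mark(Arrays.asList(request));
        System.out.println("query live:  " + live.contains(query));   // true
        System.out.println("orphan live: " + live.contains(orphan));  // false
    }
}
```

Anything not in the returned set is a "dead object" in the sense of these notes.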

All the objects that aren't reachable get attached to the "free list"
that's the basic "mark and sweep" algo
which can cause fragmentation in the memory when it goes to free up things
if the biggest free space you have is 8K, what happens when you create a 64K object? You could get "out of memory" errors.
Fragmentation is a big problem we need to deal with.
In general you want to avoid fragmentation.
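
To make the 8K / 64K example concrete, here is a small sketch of a fragmented free list. The block sizes and the `FragmentationSketch` name are invented for illustration; the point is only that total free memory and the largest contiguous block are different numbers.

```java
import java.util.Arrays;

// Toy free list: each entry is the size in KB of one contiguous free block.
final class FragmentationSketch {

    static int totalFreeKb(int[] freeBlocksKb) {
        return Arrays.stream(freeBlocksKb).sum();
    }

    static int largestFreeKb(int[] freeBlocksKb) {
        return Arrays.stream(freeBlocksKb).max().orElse(0);
    }

    // Without compaction, an allocation needs ONE block that is big enough.
    static boolean canAllocate(int[] freeBlocksKb, int requestKb) {
        return largestFreeKb(freeBlocksKb) >= requestKb;
    }

    public static void main(String[] args) {
        int[] freeList = new int[16];
        Arrays.fill(freeList, 8);   // sixteen scattered 8K holes
        System.out.println("total free KB:   " + totalFreeKb(freeList));     // 128
        System.out.println("largest free KB: " + largestFreeKb(freeList));   // 8
        System.out.println("64K fits:        " + canAllocate(freeList, 64)); // false
    }
}
```

Even with 128K free in total, the 64K request fails because no single hole is large enough.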

Memory Generation
when you look at how apps use memory, you'll find there is a distribution
a whole lot of objects that are very short lived (request variables)
fewer objects that live for a bit longer
and hopefully not that many objects with a REALLY long lifespan
that's common – most stuff happens on a request and can be cleaned up after the request is over

makes sense that we need a "generational" memory model – if a lot of variables are short-lived, we need to treat them differently than variables that are meant to live longer

Young Generation
Old Generation / Tenured Generation
Permanent Generation

when you create a variable JVM doesn't know how long it's supposed to live. As you use it JVM will TRY to figure this out

Young Generation
split into 3 areas
Eden – where new objects are created
S1 and S2 space
"S" is the "survivor" spaces – where objects go when they "survive" a garbage collection

Young + Old (Tenured) == what's known as the heap
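
One way to see these areas on a running JVM is the standard `java.lang.management` API. The sketch below just lists the memory pools the JVM reports; the pool names depend on the JVM version and the collector in use, so no particular output should be expected. `HeapPools` is a made-up class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools of the current JVM, split into heap and non-heap.
final class HeapPools {

    // Names of the pools the JVM counts as part of the heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();            // may be null for an invalid pool
            long usedKb = usage == null ? 0 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + usedKb + "K");
        }
    }
}
```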

Young Generation
stuff for a function
loop iterators
medium-lived objects (things you'd put in a session)

long-lived objects
singletons
machinery for your framework
thread pools

all that stuff gets created, ends up in Young Generation. YG fills up as you create new page requests, and eventually it has to be cleaned up. Because YG is needed all the time, the cleanup has to be really fast!

If an object survives a certain number of collections in the YG, the JVM will assume the object is Medium or Long lived and move it into the Old Generation.
That process of moving an object from Young to Old Generation = "promotion" (don't really have to worry about this, it's normal behavior)
But over time, Old Generation becomes full.
When it fills up, GC happens
GC on Old Generation is slower than on Young
Because of the size of the OG
"Stop the world" collection, can be very slow
blocks everything for multiple seconds (can be as high as 3 minutes)
In a web server env, that will kill your environment (3 minutes of no response)
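
The promotion rule can be sketched as a toy simulation: every young collection drops unreachable objects, ages the survivors, and moves anything that has reached the threshold into the old generation. All names here (`PromotionSketch`, `Obj`, `minorCollection`) are invented for illustration, and the real JVM tracks this very differently.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy young/old generation with age-based promotion.
final class PromotionSketch {

    static final class Obj {
        final String name;
        final boolean reachable;   // kept fixed to keep the sketch small
        int age;                   // number of young collections survived
        Obj(String name, boolean reachable) {
            this.name = name;
            this.reachable = reachable;
        }
    }

    final int tenuringThreshold;
    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    PromotionSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    void allocate(Obj o) {
        young.add(o);              // every new object starts in the young generation
    }

    void minorCollection() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.reachable) {    // dead: simply dropped
                it.remove();
                continue;
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                it.remove();       // survived often enough: promote
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        PromotionSketch heap = new PromotionSketch(3);
        heap.allocate(new Obj("singleton", true));
        heap.allocate(new Obj("loopVar", false));
        for (int i = 1; i <= 3; i++) {
            heap.minorCollection();
            System.out.println("after GC " + i + ": young=" + heap.young.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

With a threshold of 3, the short-lived object disappears at the first collection and the long-lived one reaches the old generation at the third.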

Why is Generational Memory good?
Lots of garbage – cleaning it up fast is worthwhile
generational memory mgmt:
YG GC often = space for new objects
Each generation focuses on "types" of objects

Permanent Generation
not really a "generation" itself
no objects from YG or OG will move here
stores just meta info about classes, internal JVM objects and JIT information
there is a direct correlation b/t the number of Java classes in your app and the amount of Permanent Generation needed

Can't "force" an object to start in Old Generation – you have no way to change that
can tweak settings and sizes of YG, Eden etc,
If you make Eden small enough you can force things into OG quicker
or can lower the "survival threshold" (to say, 10)
but can't do it on a specific object level
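
For reference, the HotSpot options usually involved in this kind of sizing are `-Xmn` (young generation size), `-XX:SurvivorRatio` and `-XX:MaxTenuringThreshold`. The sketch below shows one way to check which such options a JVM was actually started with; the example values in it are illustrative only, not recommendations, and `SizingFlags` is a made-up class name.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Shows which -X / -XX options a JVM was started with.
final class SizingFlags {

    // Keep only the non-standard (-X...) and HotSpot (-XX:...) options.
    static List<String> tuningFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-X")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Options passed to THIS JVM on its command line (often empty in a quick test).
        List<String> actual = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("flags on this JVM: " + tuningFlags(actual));

        // Illustrative values only, not a recommendation.
        List<String> example = Arrays.asList(
                "-Xmn256m", "-XX:SurvivorRatio=6", "-XX:MaxTenuringThreshold=10", "-Dapp.env=test");
        System.out.println("example:           " + tuningFlags(example));
    }
}
```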

Generation Strategy

Collector Selection
Selection Criteria
Since Java 5, we've had "ergonomics"
makes the JVM self-learn so it can tweak its own behavior if necessary
can work in some cases
more straightforward if you set things explicitly

Serial collector
this is the "mark and sweep" idea

mark and copy
marking all the reachable objects
then there's a copy phase, copies all the objects into a new area, so the fragmentation goes away
b/c of the copying, it's slower
the cost comes from the reference / object shifting that happens during the copy
"Inter-generational References" – homework
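
A minimal sketch of the copy phase, assuming the marking has already happened: live entries are copied into a fresh area and packed from the start, so the holes disappear. The arrays and the `CopySketch` name are invented for illustration. A real collector also has to update every reference to a moved object, which is the cost mentioned above and is left out here.

```java
import java.util.Arrays;

// Toy "mark and copy": copy the marked slots into a new, contiguous area.
final class CopySketch {

    // Copies live entries from fromSpace into a fresh toSpace, packed from index 0.
    static String[] copyLive(String[] fromSpace, boolean[] marked) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;                                // next free slot in toSpace
        for (int i = 0; i < fromSpace.length; i++) {
            if (marked[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"a", "dead1", "b", "dead2", "c", null};
        boolean[] marked   = {true, false, true, false, true, false};
        String[] toSpace = copyLive(fromSpace, marked);
        System.out.println(Arrays.toString(toSpace)); // [a, b, c, null, null, null]
    }
}
```

After the copy, all free space sits in one contiguous run at the end of the new area.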

Both mark/sweep and mark/copy
need exclusive access to the reference graph
causes long "stop the world" pause times

Parallel Mark and Copy
already in current flavors of CF (it's been in Java since 1.4.2)
distributes the marking and copying phases over multiple threads
the actual collecting is still "stop the world" but the time needed is much shorter than before when this was on 1 thread

OG collectors
many objects and low mortality mean Mark and Copy (MaC) would be inefficient. Instead we use "Mark and Compact"
variation of Mark and Sweep with lower fragmentation
4 phases to this process
"MaCo" – full collection algo – there is no "pure OG collection"
doesn't run often, but when it runs it will take a while
-XX:+UseSerialGC
not recommended for a web server in general
instead, use "parallel"
-XX:+UseParallelOldGC
default since Java 6

CMS – Concurrent Mark Sweep
old generation GC that's doing concurrent clean up, marks things in the background as your other code runs. Then does the clean up in a separate thread so the "stop the world" parts of this are extremely short.
-XX:+UseConcMarkSweepGC to turn this on
this is the GC that performs the best in MOST scenarios on a web server, because it avoids long pauses and does everything it can in the background.

New thing "garbage first" G1
New to Java 6, experimental at that time
Full support in Java 7 and 8
for BIG heaps 4 – 8 GB
Less fragmentation than Concurrent Mark Sweep
-XX:+UseG1GC
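
After setting one of these switches it is worth confirming which collectors the JVM actually ended up with. A small sketch using the standard `GarbageCollectorMXBean` API follows; the collector names it prints differ between JVM versions and flags, so treat the output as something to read rather than to match. `ActiveCollectors` is a made-up class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Prints the garbage collectors the current JVM is running with.
final class ActiveCollectors {

    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available".
            System.out.println(gc.getName()
                    + "  collections=" + gc.getCollectionCount()
                    + "  totalMillis=" + gc.getCollectionTime());
        }
    }
}
```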


How to approach tuning the JVM

Do not trust consultants, blog posts, mailing lists, etc that say "this is the BEST setting..."
no such thing
have to look at your app
try different settings
and find out what works for you
have to try different strategies
no 2 environment settings are the same

Typical Reasons for tuning
application growth
added more ram, more CPU cores
performance issues (unresponsiveness)
JVM level error messages in log files
"hs_err_pid" files – error messages in the JVM

JVM tuning isn't a "magic bullet"

To get a good picture of how your app runs, you really need at least a day or 2 of uptime to monitor. The first few hours will look different from how your app behaves after it has been running for a long time.

Tools –
FusionReactor
great for performance issues on a Java or ColdFusion level

VisualVM
(comes with the JDK)

can use to see how many instances of a class are loaded, stuff like that
Free

iCMS – can be a better GC method if you have fewer cores/CPUs
but need to apply the setting then let it stay for a day or 2 at least to see how it behaves over time (like all the other settings discussed)

GCViewer – app to look at the garbage collection logs

Process
make an assumption for load, memory and GC settings
run load tests, monitor and measure results
but leave it running for at least 24 hours
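
If none of the tools above is at hand, the "monitor and measure" step can be roughed out from inside the JVM with the standard management API. This is a minimal sketch, not a replacement for real monitoring: `HeapSampler`, the sample count and the interval are all made up, and a real run would log for a day or more rather than three seconds.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Periodically prints heap usage and cumulative GC time for the current JVM.
final class HeapSampler {

    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Sum of the time all collectors have spent so far, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();   // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        final int samples = 3;              // illustrative; far too few for real tuning
        final long intervalMillis = 1000;   // illustrative sampling interval
        for (int i = 0; i < samples; i++) {
            System.out.println("usedHeapMB=" + usedHeapBytes() / (1024 * 1024)
                    + "  gcMillis=" + totalGcMillis());
            Thread.sleep(intervalMillis);
        }
    }
}
```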