Terse Systems








CLOSING THE OPEN DOOR OF JAVA OBJECT SERIALIZATION

08 Nov 2015 • security


TL;DR

This is a long blog post, so please read carefully and all the way through
before you come up with objections as to why it's not so serious. Here's the
short version.

Java Serialization is insecure, and is deeply intertwingled into Java monitoring
(JMX) and remoting (RMI). The assumption was that placing JMX/RMI servers behind
a firewall was sufficient protection, but attackers use a technique known as
pivoting or island hopping to compromise a host and send attacks through an
established and trusted channel. SSL/TLS is not a protection against pivoting.

This means that if a compromised host can send a serialized object to your JVM,
your JVM could also be compromised, or at least suffer a denial of service
attack. And because serialization is so intertwingled with Java, you may be
using serialization without realizing it, in an underlying library that you
cannot modify.

To combat an attacker who has penetrated or bypassed initial layers of security,
you need a technique called defense in depth.

Ideally, you should disable serialization completely using a JVM agent called
notsoserial. This will give you a security bulkhead and you can add network
monitoring to see if an attacker starts testing ports with serialized objects.

If you can't disable serialization, then there are options for limiting your
exposure until you can remove those dependencies. Please talk to your developers
and vendors about using a different serialization format.


THE EXPLOIT

If you can communicate with a JVM over Java object serialization, via
java.io.ObjectInputStream, then you can send a class (technically bytes that
cause instantiation of a class already on the classpath) that can execute
commands against the OS from inside of the readObject method, and thereby get
shell access. Once you have shell access, you can modify the Java server however
you feel like.
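
To make the mechanism concrete, here is a minimal and deliberately harmless sketch. The Noisy class and its sideEffectRan flag are hypothetical names invented for this illustration, standing in for a gadget class that is already on the classpath. It only flips a boolean, but it shows that the readObject hook runs inside ObjectInputStream.readObject, before the caller has any chance to inspect or cast the result.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ReadObjectDemo {

    // Hypothetical class for illustration only; stands in for a "gadget" on the classpath.
    static final class Noisy implements Serializable {
        private static final long serialVersionUID = 1L;
        static boolean sideEffectRan = false;

        // ObjectInputStream calls this hook while it is still deserializing.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan = true; // a real gadget would do something harmful here
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = serialize(new Noisy());
        try {
            // The caller expects a String, but the cast only happens after readObject returns.
            String expected = (String) deserialize(payload);
            System.out.println(expected);
        } catch (ClassCastException e) {
            System.out.println("cast failed, side effect already ran: " + Noisy.sideEffectRan);
        }
    }
}
```

The type check in the caller comes too late: by the time the cast fails, the hook has already executed.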

This is a class of exploit called "deserialization of untrusted data", aka
CWE-502. It's a class of bug that has also been encountered in Python, PHP, and
Rails.

Chris Frohoff and Gabriel Lawrence presented a talk called Marshalling Pickles
that talked about some exploits that are possible once you have access to Java
object serialization.


PRACTICAL ATTACKS

A blog post by FoxGlove Security took the Marshalling Pickles talk and pointed
out that it's common for application servers to run ports with either RMI, or
JMX, a management protocol that runs on top of RMI. An attacker with access to
those ports could compromise the JVM.

The proposed fix was to identify all app servers containing the
commons-collections JAR and remove it.

The problem is, you don't need to have commons-collections present – there are a
number of different pathways in. The ysoserial tool shows four different ways
into the JVM using object serialization, and that's only with the known
libraries.

There are any number of Java libraries which could have viable exploits. It
isn't over.

Matthias Kaiser of Code White is doing further research in Exploiting Java
Serialization, and says that more exploits are coming.

So, fixing this particular exploit doesn't fix the real problem, nor does it
explain why it exists.


THE REAL PROBLEM

The real problem is that "deserialization of untrusted input" happens
automatically, at the ObjectInputStream level, when readObject is called.

You need to check untrusted input first before deserializing it, a process
called "validation" or "recognition" if you're up on language security. But the
specification is so powerful and complex that there isn't a good way to securely
validate Java serialized objects.

Isolation – only having a management port open inside your "secure" data center
– isn't enough.

Cryptography – using message authentication or encryption – isn't enough.

Obscurity – hoping that this bug is too obscure to be used by attackers or that
your system is beneath their notice – isn't enough.

User level fixes – subclassing ObjectInputStream with a whitelist or wrapping
your code in doPrivileged blocks – aren't enough.

I'll break all of this down in detail in the following sections.

EDIT: Charles Miller provides more context.


WHY ISOLATION AND CRYPTOGRAPHY AREN'T ENOUGH

It's not a secret in the Java developer community that object serialization can
load arbitrary classes. In fact, object serialization is part of the reason
that Java Applets are commonly disabled on browsers.

So, my first response to this exploit was "yes, this is why you don't publicly
expose your management ports to attackers on the Internet." JMX and RMI are
designed to work internally to the data center, and the assumption is that these
ports are not publicly exposed. This is a commonly held tenet of server side
applications – for example, Redis's security policy is explicitly “it’s totally
insecure to let untrusted clients access the system, please protect it from the
outside world yourself.”

To this end, developers typically assume that some combination of network
isolation, sandboxing, containers and hypervisors will create a secure data
center that will prevent attackers from gaining direct access to a port. In the
event that ports need to be exposed outside the data center, there are options
for RMI over TLS and JMX over TLS.

Rob Rodgers graciously and kindly corrected me on the problem inherent in this
approach: it assumes that the attacker only ever attacks from the outside.

Firewalls are typically effective at preventing direct attacks from the outside,
but they only cover one avenue of entrance. Rather than penetrate firewalls
directly, attackers typically circumvent them by coming from another direction.
They'll use the backdoor – connecting through unsecured laptops that manage
payroll, desktop machines, smartphones, etc. If they're remote, they'll set up
targeted phishing emails that have links that download malware. Or, if they're
local, they'll try physical solutions – using a WiFi Pineapple to get into the
wireless network like TJ Maxx, dropping USB keys in the parking lot and so
forth.

This sounds complicated, but to an attacker, it's about as simple as attaching a
debugger to a running process and running through a breakpoint is to a
developer. It's just part of the job.

Most attacks in the data center are from a compromised host inside the firewall.
This changes the characteristics of the attacker: rather than a man in the
middle, the attacker is now a man on the edge. Once there, they can see all the
traffic to and from that machine. They could possibly poke at the DHCP server or
the DNS server and start impersonating other clients. They have all the access
to credentials that they need.

This is a class of exploit that you don't typically hear about in enterprise
circles, because the whole idea is that there's an "inside" and an "outside" and
the "inside" is secure. It's just not the case. The reality is that attackers
typically manage to compromise one machine inside the corporate firewall, and
then leverage that machine to gain access to others.

SSL/TLS doesn't help here. If you are in a coffee shop and connecting to a data
center, TLS will protect you against a "man in the middle" attack, where the
attacker does not have the private keys necessary to break into the TLS session. If
you are inside the data center, then things are different.

TLS will not save you. The calls are coming from inside the house. No
combination of MAC, encryption and digital signatures can prevent a compromised
host from sending a serialized nastygram.

For the same reason, obfuscation (what is commonly meant by "security by
obscurity") is not a solution, because a compromised host will happily
deobfuscate the data.

From the point of view of the attacker, the focus is on the attack and the
attack vectors, which is why you don't hear the phrase "compromised host" all
that often. Brian Keefer suggests "lateral movement." Blake Hyde suggests
"beachhead." Ben Tasker suggests "pivot" or "pivoting."

RMI has long been known to be a juicy target for hackers – it does not work well
with firewalls or NAT, which is why many companies will run HTTP proxies through
firewalls directly to RMI. There's a tutorial showing how to use Metasploit to
gain access to RMI servers. And because JMX runs on RMI, there's every reason
for operations to enable it and access it on a remote port.

You must assume that your network is insecure and that you may be talking to
compromised hosts. You must establish bulkheads, a technique known as defense in
depth, to prevent the infection from spreading.


WHY OBSCURITY ISN'T ENOUGH

You may be wondering why anyone would bother attacking you with this, especially
if the attack is obscure and you don't think you're a big target. How real is
this? Does anyone actually do this?

The problem is that while there aren't that many attackers overall, they
benefit hugely from automation. Your average attacker is far more likely to be a
script kiddie than Mr. Robot, and script kiddies have libraries of every
possible exploit, plus automation frameworks that will run through every exploit
until something works. There's even a Linux distribution, Kali Linux, built
specifically for penetration testing.

Security firms rely on "pen-testers" to audit companies for these
vulnerabilities, and the industry has already added this one. Direct Defense put
together a Burp Suite Extender called Super Serial with instructions on how to
use it – this locates all serialized objects from the server in the Burp Suite
network scanning tool.

The original exploit has already been refined: Trust Foundry has a blog post
describing a "one click exploit" and has an executable jar file on Github.

EDIT: As of December 3rd, Imperva reports 645 attacks using this vulnerability
from 503 different IPs.

So, relying on obscurity of the bug won't help: it's already been packaged and
broadcast. And relying on your company's obscurity… well, attacks are common,
cheap, and mostly automated. Maybe you'll be lucky for a while, but saying you
won't be attacked is like saying you won't get spam in your inbox. It may be
obscure to you, but it's not obscure to them.

And the end result? Well, look at Sony. Look at Target. Look at TJ Maxx. Look at
Fandango and Credit Karma. Security breaches are real, and they have
consequences.


THE FALLACY OF TRUSTED INPUT

The larger issue here is the implicit assumption that "untrusted input" implies
that there can be trusted input. Really, it should be "unvalidated input." Trust
– actual, no kidding, trust – is rare. You must trust the JVM class loader and
your initial configuration files because you can't start the application without
them, but everything after that is runtime input.

Barring actual custom hardware such as a trusted platform module or hardware
security module, all runtime input can be compromised. Files can be rewritten.
Network input can be spoofed. Anything that is read in from I/O is "untrusted"
in the sense that it has not been validated.

But even an object that has gone through validation is still untrustworthy: it
could ask for things it has no right to, have faked credentials, etc. Trust or
"untrust" are beside the point – input should have to jump some hurdles before
it may be accepted for processing.


THE IDEAL VALIDATION SCENARIO

If you are working in a microservices / domain driven design context, you build
validation in an anti corruption layer around your bounded context (essentially
a fancy word for "your app up to the point it has to work with I/O"). Everything
outside the anti corruption layer is unvalidated. Everything inside the boundary
is validated. This matches up well with language security principles. Here, we
want full recognition before processing.

To this end, you use strongly typed objects called Value Objects to represent
your validated input. As an example, the input to an "address service" might be
"1 Market St, San Francisco CA 94111". This is a String – a raw, unvalidated,
out of the box type. Raw types are broken. What you want is an Address – this
means that you want the Address parser to validate and create a series of
AddressLine, City and ZipCode objects. You never want to expose a raw type like
String in your domain. You especially don't want to take raw types in your
public APIs. If all your methods only use types created from validation, then
you can limit your exposure. (Java suffers from not having value types, but
they're showing up in JDK 1.9, finally.)
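
As a rough sketch of the Value Object idea, here is what one of those types might look like. The ZipCode class, its parse method and the five-digit rule are assumptions made for illustration (real postal codes need more than this); the point is only that the constructor is private, so the rest of the domain can never hold a ZipCode that skipped validation.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical value object: the only way to obtain one is through parse().
final class ZipCode {
    // Assumption for this sketch: a ZIP code is exactly five ASCII digits.
    private static final Pattern FIVE_DIGITS = Pattern.compile("[0-9]{5}");

    private final String value;

    private ZipCode(String value) {
        this.value = value;
    }

    // Recognize the raw input fully before creating the typed value.
    static Optional<ZipCode> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String trimmed = raw.trim();
        return FIVE_DIGITS.matcher(trimmed).matches()
                ? Optional.of(new ZipCode(trimmed))
                : Optional.empty();
    }

    String value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ZipCode && ((ZipCode) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return "ZipCode(" + value + ")";
    }

    public static void main(String[] args) {
        System.out.println(ZipCode.parse("94111").isPresent());
        System.out.println(ZipCode.parse("94111; rm -rf /").isPresent());
    }
}
```

Methods inside the bounded context would then take a ZipCode rather than a String, so the compiler enforces that validation has already happened.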

This is the ideal. In theory, you can take the raw bytes of every object that
you received, parse and validate it, and only return an object after validation,
instead of having ObjectInputStream create an object out of the blue.


WHY VALIDATION IS HARD

The reality is not so fun. ObjectInputStream will let you hook into it, and
subclass it. The tough part is validating the input given that it's happening
from inside ObjectInputStream.

Full on validation… turns out to be tricky. Very tricky. Sami Koivu has a couple
of great blog posts in 2010(!) on why complex+powerful is a bad combination for
security and breaking defensive serialization – I won't go into great detail,
but the problem is that the serialization logic is so complex and powerful that
"secure validation while deserializing is extremely difficult" and (according to
Koivu) not only does the CERT guide not get it right, but even Josh Bloch
doesn't get it right.

The Look-ahead Java deserialization solution that has been frequently mentioned
suffers from this – the validation it suggests is a whitelist inside of
ObjectInputStream.resolveClass – at this point you've already taken a bite out
of the apple. Unless you are extraordinarily careful, even seemingly harmless
whitelisted classes that are used in serialized objects can be used in an
attack.
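
For reference, the look-ahead approach can be sketched roughly as follows. This is an illustration of the general technique, not the code from any particular article or library; the LookAheadObjectInputStream name and the helper methods are made up for this example. Note where the check lives: inside resolveClass, which means ObjectInputStream has already started parsing the stream by the time the whitelist is consulted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical subclass: rejects any class descriptor whose name is not whitelisted.
final class LookAheadObjectInputStream extends ObjectInputStream {
    private final Set<String> allowedClassNames;

    LookAheadObjectInputStream(InputStream in, Set<String> allowedClassNames) throws IOException {
        super(in);
        this.allowedClassNames = allowedClassNames;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!allowedClassNames.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(), "class is not on the whitelist");
        }
        return super.resolveClass(desc);
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object readWithWhitelist(byte[] data, Set<String> allowed)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                     new LookAheadObjectInputStream(new ByteArrayInputStream(data), allowed)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> whitelist = Collections.singleton("java.util.ArrayList");
        List<String> names = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(readWithWhitelist(serialize(names), whitelist));
        try {
            readWithWhitelist(serialize(Integer.valueOf(42)), whitelist);
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This only protects code that actually constructs the subclass. Any library that creates a plain ObjectInputStream bypasses it entirely, and a whitelisted class can still misbehave during deserialization.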


WHY WHITELISTING ISN'T ENOUGH

Whitelisting for allowed classes can prevent unknown classes from being called,
but it can't prevent pathological classes from being deserialized. Wouter
Coekaerts shows a denial of service attack using nested HashSets that would not
be caught by whitelisting. Likewise, if you whitelist java.net.URL, you can't
stop a specially constructed object doing a series of blocking network lookups
based off the hashCode method.

Nor can whitelisting stop exploitation of underlying serialization bugs. It's a
big, hairy piece of code, and there's no real way to slice off a piece of it.

You can, of course, implement your own custom parser and ignore
ObjectInputStream completely. That will work for simple data objects, and you
can throw away anything you don't recognize. That's a lot of work though, and of
course the more complex the parser, the more likely it is to have bugs.


WHY USER LEVEL SOLUTIONS AREN'T ENOUGH

You're probably thinking all is not lost.

You'll disable RMI and JMX ports, and move to JVM agent based solutions like
Takipi or Hyperic Sigar. Failing that, you'll install Jolokia or jmxtrans and
try to limit your exposure.

You'll move to JSON (the lowest common denominator) or a faster, smaller,
language independent binary protocol that does validation and schema resolution:
something like Protocol Buffers / Cap'n Proto / Thrift / Avro. Or, if you're
going to stick with Java, you'll use Kryo, although in that case I suggest
adding Chill and having setRegistrationRequired turned on.

Then, you would carefully read the secure coding guidelines and follow all of
the recommendations in your code, and you'd be fine.

The problem is that no matter what you do in your own code, you can't be sure
that some library code, somewhere, isn't ignoring that completely and using a
raw ObjectInputStream anyway. Github shows 678,489 instances of
ObjectInputStream in various projects. JMX and RMI are only one vector – unless
you decompile all your libraries and check them, you don't know that one of the
libraries in your framework isn't using deserialization for a disk storage
format (as was common in 200x), or for session backup, or for a distributed
caching scheme.

Even then – even if you scan all of your code and can verify that
ObjectInputStream.readObject is never called anywhere within your codebase – you
still could be vulnerable. If you have any code that leans heavily on reflection
(which happens more often than not in Java frameworks), then you can still have
ObjectInputStream instantiated, although it will take more work to feed it the
right bytes and call readObject. And I don't know what the impact is to
sandboxed user supplied code that is supposed to run in a custom classloader,
but I bet everyone will be reviewing code for a while.

EDIT: SRC:CLR has a list of 41 libraries that reference Apache Commons
Collections and perform serialization, although they are careful to note they
have not proved untrusted deserialization in all these libraries.

EDIT: Sijmen Ruwhof discusses scanning an enterprise organisation for the
critical Java deserialization vulnerability. This is a useful way to find what
systems have serialization exposed on their ports: look at well known RMI / JMX
ports and scan for the hexadecimal string "AC ED 00 05", or for "rO0" when the
stream is Base64 encoded. It should be noted that this still only deals with
known ports and vulnerable classes. He also suggests deprecation of Java
serializable objects long term.
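
As a small illustration of what such a scan looks for, the sketch below checks a captured payload for the serialization stream header. The SerializationSniffer class and its method names are hypothetical; the byte values are the stream magic 0xACED followed by stream version 5, as defined in java.io.ObjectStreamConstants.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Base64;

// Hypothetical helper for spotting Java serialization streams in captured data.
final class SerializationSniffer {

    // A serialized stream starts with STREAM_MAGIC (0xACED) and STREAM_VERSION (0x0005).
    static boolean startsWithStreamHeader(byte[] data) {
        return data != null
                && data.length >= 4
                && data[0] == (byte) 0xAC
                && data[1] == (byte) 0xED
                && data[2] == (byte) 0x00
                && data[3] == (byte) 0x05;
    }

    // The same header, once Base64 encoded, begins with "rO0".
    static boolean startsWithBase64Header(String text) {
        return text != null && text.startsWith("rO0");
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject("hello");
        }
        byte[] payload = bytes.toByteArray();
        System.out.println(startsWithStreamHeader(payload));
        System.out.println(startsWithBase64Header(Base64.getEncoder().encodeToString(payload)));
    }
}
```

A prefix match like this is only a heuristic: it finds streams that begin at the start of a payload, and will miss serialized data embedded deeper inside another protocol.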


WHY SECURITYMANAGER ISN'T ENOUGH

There is something that is supposed to manage the JVM at a system level –
SecurityManager. There are two problems with this: SecurityManager is both too
strict and too permissive.

The system security manager has to be specifically enabled with
-Djava.security.manager, and is a blunt instrument that causes lots of things to
stop working everywhere. But SecurityManager doesn't limit ObjectInputStream in
any significant way: you can limit implementations of subclasses of
ObjectInputStream, but you can't make it safe or turn it off.

You can create a custom SecurityManager with your own permissions and set it on
a thread, but you still have the same problem: any custom checks will apply to
your user level subclasses of ObjectInputStream and any libraries can use the
base ObjectInputStream with no restrictions.


HACKING THE JVM

The ideal solution is to override ObjectInputStream itself, which will fix
object serialization everywhere in the JVM.

My previous solution was to override the bootclasspath with a custom
implementation of ObjectInputStream, but this contravenes the Java binary
license. There is a precedent for using this method for fixing Sun's XML class
bugs, but there is a better way.

The better way to do this, without violating the Java binary license, is to use
Eirik Bjørsnøs's notsoserial project, a Java agent that will hook into the JVM
and prevent and/or control deserialization everywhere.


YOUR BEST OPTION

Your best option is to turn off object serialization completely, everywhere in
the JVM, for good. This means no RMI, and no JMX, but there are other options.

Use notsoserial with nothing in the whitelist.


YOUR SECOND BEST OPTION

In the event that you can't turn it off – find out what you can do to squeeze
the problem down until you can turn it off completely. If you do have reasons to
have it on, then make sure that everything is logged, tracked and locked down as
much as possible.

So, use notsoserial with whitelisting, by tracing your serialization. Keep
network communication locked down with TLS client certificates, and investigate
Jolokia or jmxtrans if you can.

Also, talk to your operations team about using Burp Suite or Haka to identify
Java serialization across the network.

This might be a pain if you are using Java Mission Control / Java Flight
Recorder in production.


YOUR THIRD BEST OPTION

If you're writing or working with a library that requires object serialization
and you don't have the option to control your clients or your endpoints, then
you can work with a subclass of ObjectInputStream.

Take a look at ValidatingObjectInputStream or SerialKiller for an
ObjectInputStream replacement.


JAVA'S BEST OPTION

The best option is for Java itself to remove serialization. Unfortunately, the
last attempt, JEP 154, was an April Fool's joke. There have been discussions to
add an explicit serialization API, but I don't know of any current movement.

Another route that Oracle could take is to enhance the SecurityManager to allow
more fine-grained control of serialization. This would work when the
SecurityManager was enabled, but not break any existing classes or code
otherwise.




WAIT, THERE'S MORE!

Just because you're done with Java object serialization using ObjectInputStream
doesn't mean that you're done: you can deserialize directly to classes using
XMLDecoder (the reading counterpart of XMLEncoder) as well!

Fortunately, now that you understand why deserialization is bad, you can use the
same techniques shown here to validate or disable XMLDecoder in the same way. The
point here is that you need to think about all the inputs to your system, and
ask at each point if there is a gatekeeper. If you don't have validation built
in, you're allowing an open door into your system.
