
URL: http://www.ubiqx.org/cifs/SMB.html

2. SMB: THE SERVER MESSAGE BLOCK PROTOCOL



--------------------------------------------------------------------------------



 



  


2.1 A LITTLE BACKGROUND ON SMB



email

--------------------------------------------------------------------------------

From: Steven French, Senior Software Engineer, IBM
To: Chris Hertel

Chris,

Hope things are going well in the cold north ...

I thought the following info would be interesting to you. I met the original
"inventor" of SMB a few years ago - Dr. Barry Feigenbaum - who back in the early
80's was working on network software architecture for the infant IBM PCs,
working for IBM in the Boca Raton plant in Florida. He mentioned that it was
first called the "BAF" protocol (after his initials) but he later changed it to
SMB. In the early DOS years IBM and Microsoft (with some input from Intel and
3Com) contributed to it but by the time of the first OS/2 server version
(LANMAN1.0 dialect and later) Microsoft did much of the work (for "LAN Manager"
and its relatives).
 



Like NetBIOS, the Server Message Block protocol originated a long time ago at
IBM. Microsoft embraced it, extended it, and in 1996 gave it a marketing upgrade
by renaming it "CIFS".

Over the years there have been several attempts to document and standardize the
SMB/CIFS protocol:
 

Change is the essential process of all existence.
-- Spock (Leonard Nimoy), "Let That Be Your Last Battlefield", stardate 5730.2

 * Microsoft keeps an archive of documentation covering older versions of
   SMB/CIFS. The collection spans a period of roughly ten years, starting at
   about 1988 with the SMB Core Protocol. The collection is housed, it seems, on
   a dusty FTP server in a forgotten corner of a machine room somewhere in the
   Pacific Northwest. The URL for the CIFS archive is
   ftp://ftp.microsoft.com/developr/drg/CIFS/.



 * In 1992, X/Open (now known as The Open Group) published an SMB specification
   titled Protocols for X/Open PC Interworking: SMB, Version 2. The book is now
   many years out of date and SMB has evolved a bit since its publication, yet
   it is still considered one of the best references available [1]. The Open Group
   is a standards body so the outdated version of SMB described in the X/Open
   book is, after all, a standard protocol.



 * A few years later, Microsoft submitted a set of CIFS Internet Drafts to the
   IETF (Internet Engineering Task Force), but those drafts were somewhat
   incomplete and inaccurate and they were allowed to expire. Microsoft's more
   recent attempts at documenting CIFS (starting in March, 2002) have been
   rendered useless by awkward licensing restrictions, and from all accounts
   contain no new information [2]. The expired IETF Internet Drafts (by Paul Leach
   and Dilip Naik) are still available from the Microsoft FTP server described
   above and other sources around the web.



 * The CIFS Working Group of the Storage Network Industry Association (SNIA) has
   published a CIFS Technical Reference based on the earlier IETF drafts. The
   SNIA document is neither a specification nor a standard, but it is freely
   available from the SNIA website.

Without a current and authoritative protocol specification, there is no external
reference against which to measure the "correctness" of an implementation, and
no way to hold anyone accountable. Since Microsoft is the market leader, with a
proven monopoly on the desktop, the behavior of their clients and servers is the
standard against which all other implementations are measured.
 

You knew the job was dangerous when you took it.
-- Super Chicken, Jay Ward and Bill Scott, ABC TV, 1967-1968

Jeremy Allison, the Samba Team's First Officer [3], has stated that "The level of
detail required to interoperate successfully is simply not documentable". One
reason that this is true is that Microsoft can "enhance" SMB behavior at will.
Combined with the dearth of authoritative references, this means that the only
criterion for a well-behaved SMB implementation is that it works with Microsoft
products. As a result, subtle inconsistencies and variations have crept into the
protocol. They are discovered in much the same way that a dog-owner discovers
poop in the yard in springtime when the snow melts [4].

    
Many people dread spring chores, but spring also brings the flowers. The
children play, the dog chases a butterfly, the birds sing...and it all seems
suddenly worthwhile. Likewise with the work we have ahead. Things are not really
too bad, once you've gotten started.


2.1.1 GETTING STARTED

This part of the book will cover the basics of SMB, enumerate and describe some
of the SMB message types (commands), discuss protocol dialects, give some
details on authentication, and provide a few examples. That should be enough to
help you develop a working knowledge of the protocol, a working SMB client, and
possibly a simple server.

Bear in mind, though, that SMB is more complex and less well defined than NBT.
In the NBT section it was possible to describe every message type and provide a
comprehensive review of the entire NBT protocol. It is not practical to cover
all of SMB in the same way. Instead, the goal here is to explain the basics of
SMB, provide details that are missing from other sources, and describe how to go
about exploring SMB on your own. In other words, the goal is to develop
understanding rather than simply providing knowledge.

The textbook for this class is the latest version of the SNIA CIFS Technical
Reference. Additional sources are listed in the References section near the end
of this book. The most important tool, however, is probably the protocol
analyzer. Warm up your copy of Ethereal or NetMon, and get ready to do some
packet shoveling.


2.1.2 NBT OR NOT NBT

Before we actually start, there is one more thing to mention: The SMB protocol
is supposed to be "transport independent". That is, SMB should work over any
reliable transport that meets a few basic criteria. NBT is one such transport,
but SMB does not really require the NetBIOS API. It can, for instance, be run
directly over TCP/IP.

Just for fun, we will refer to SMB over TCP/IP without NBT as "naked" or "raw".
When running naked, SMB defaults to using TCP port 445 instead of the NBT
Session Service port (TCP/139). Windows2000, WindowsXP, and Samba all support
raw transport, but the large number of "legacy" Windows clients still in use
suggests that NBT will not go away any time soon.

Other than the new port number, there are only two notable changes between NBT
and naked transport. The first is that naked transport does not make use of the
NBT SESSION REQUEST and POSITIVE SESSION RESPONSE messages. The second is that
the two transports interpret the SESSION MESSAGE header a bit differently.

Recall (from section 1.6) that the NBT Session Service prepends a four-byte
header to each SESSION MESSAGE, like so:



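 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+-------------+---------------------------------+
|   0 (zero)    | <reserved>  |        LENGTH (17 bits)         |
+---------------+-------------+---------------------------------+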

The LENGTH field, as shown, is 17 bits wide [5]. Raw TCP transport also prepends a
four-byte header, but the full 24 bits are available for the LENGTH:



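 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+-----------------------------------------------+
|   0 (zero)    |               LENGTH (24 bits)                |
+---------------+-----------------------------------------------+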


 

Your mileage may vary.
-- advertiser's disclaimer

  

Appendix B of the SNIA CIFS Technical Reference is the only source that was
found which clearly shows the naked transport LENGTH field as being 24 bits
wide. 24 bits translates to 16 megabytes, though, and that's a bigbunch--more
than is typically practical. Fortunately, the actual maximum message size is
something that is negotiated when the client and server establish the session.
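
The difference between the two headers is easiest to see in code. The sketch below is only an illustration: the function names are made up for this example, and it assumes the length is stored in network byte order (most significant byte first), as in the NBT header from section 1.6.

```c
#include <stdint.h>

/* NBT SESSION MESSAGE header: byte 0 is the type (0x00), the low-order
 * bit of byte 1 is the top bit of the length, and bytes 2-3 hold the
 * remaining 16 bits. That gives a 17-bit length. */
uint32_t nbt_length(const uint8_t hdr[4])
{
  return ((uint32_t)(hdr[1] & 0x01) << 16)
       | ((uint32_t)hdr[2] << 8)
       |  (uint32_t)hdr[3];
}

/* Raw TCP transport header: byte 0 is zero and bytes 1-3 hold a
 * 24-bit length. */
uint32_t raw_length(const uint8_t hdr[4])
{
  return ((uint32_t)hdr[1] << 16)
       | ((uint32_t)hdr[2] << 8)
       |  (uint32_t)hdr[3];
}

/* Fill in a raw transport header for a message of len bytes.
 * Returns 0 on success, -1 if len does not fit in 24 bits. */
int raw_set_header(uint8_t hdr[4], uint32_t len)
{
  if (len > 0x00FFFFFF)
    return -1;
  hdr[0] = 0;
  hdr[1] = (uint8_t)(len >> 16);
  hdr[2] = (uint8_t)(len >> 8);
  hdr[3] = (uint8_t)len;
  return 0;
}
```

Given the same four bytes, the two readers disagree only when bits above the 17th are set, which is exactly the range that the negotiated maximum message size normally keeps you out of.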

When we discuss the SMB messages themselves we will ignore the SESSION MESSAGE
headers, since they are part of the transport, not the SMB protocol.



--------------------------------------------------------------------------------


2.2 AN INTRODUCTORY TOUR OF SMB

We will start with a quick museum tour of SMB. Our guide will be the venerable
Universal Naming Convention (UNC). You may remember UNC from the brief
introduction way back in section 1.1. UNC will provide directions and point out
highlights along the tour.

Please stay together, everyone.

The UNC directions are presented in terms of a path, much like the Uniform
Resource Identifier (URI) paths that are used on the World Wide Web. To explain
UNC, let us first consider something more modern and familiar:



> http://ubiqx.org/cifs/SMB.html

That string is in URI syntax, as used by web browsers, and it breaks down to
provide these landmarks:



http      == The protocol to use.
ubiqx.org == The name of the server.
cifs      == The directory path.
SMB.html  == The file name.



The landmarks guide us along a path which eventually leads us to the file we
wanted to access.

The SMB protocol pre-dates the use of URIs and was originally designed for use
on LANs, not internetworks, so it naturally has a different (though surprisingly
similar) way of specifying paths. A Universal Naming Convention (UNC) path
comparable to the URI path above might look something like this:



> \\ubiqx\cifs\SMB.html

...and would parse out like this:



ubiqx    == The name of the server.
cifs     == The directory path.
SMB.html == The file name.



Very similar indeed.
 

The devil is in the details.
-- Popular saying

  

One obvious difference between the two formats is that UNC doesn't provide a
protocol specification. That's not because it always assumes SMB. The UNC format
can support all sorts of filesharing protocols, but it is up to the underlying
operating system or application to try to figure out which one to use. Protocol
and transport discovery are handled by trial-and-error, with each possibility
tested until something works. As you might imagine, a system with AppleTalk,
NetWare, and SMB all enabled may have a lot of work to do.

The UNC format is handled natively by Microsoft & IBM's extended family of
operating systems: DOS, OS/2, and Windows [6]. Samba's smbclient utility can also
parse UNC names, but it does so at the application level rather than within the
OS and it only ever tries to deal with SMB. Even so, smbclient must handle both
NBT and naked transport, which can be tricky.


2.2.1 THE SERVER IDENTIFIER

The first stop on our UNC tour of SMB is the server name field, which is really
a server identifier field because it will accept addresses in addition to names.
This book concerns itself with only two transports--NBT and naked TCP
transport--so the only identifiers we care about are:



 * NetBIOS names,
 * DNS names, and
 * IP addresses.

NetBIOS and DNS names both resolve to IP addresses, so all three are equivalent.

Sort of...

Recall that the NBT SESSION REQUEST packet requires a CALLED NAME in order to
set up an NBT session with the server. Without a correct CALLED NAME, the NBT
SESSION REQUEST may be rejected (different implementations behave differently).
So...



 * if the transport is NBT (not raw),
 * and the server is identified using a DNS name or IP address...

...then we're in a bit of a pickle. How do we find the correct NetBIOS name to
put into the CALLED NAME field? There really is no "right" way to reverse-map an
IP address to a particular NetBIOS service name. The solution to this problem
involves some guessing, and it's not pretty. We will go into detail when we
discuss the interface between SMB and the transport layer.

Of course, if SMB is running over raw transport then there is no NBT SESSION
REQUEST message and, therefore, no CALLED NAME. In that case, the NetBIOS name
isn't needed at all, which saves a lot of fuss and bother.


2.2.2 THE DIRECTORY PATH



A path! A path!
-- The Knights Who Say Ni, Monty Python and the Holy Grail,
   Monty Python's Flying Circus

  

The directory path looks just like a directory path, but there is one small
thing that makes it different. That thing is called the "share name".

Whenever a resource is made available (shared) via SMB it is given a share name.
The share name doesn't need to be the same as the actual name of the object
being shared as it exists on the server. For example, consider the directory
path below:



> /dogs/corgi/stories/jolyon/

Suppose we just want to share the /stories subdirectory. If we simply call it
"stories" no one will know what kind of stories it contains, so we should give
it a more descriptive name. We might, for example, call it "dogbytes".

The share name takes the place of the actual directory name when the share is
accessed via SMB. If the server is named "petserver", then the UNC path to the
same directory would be:



> \\petserver\dogbytes\jolyon\

As shown in figure 2.1, there can be more than one share name pointing to the
same directory and access rules may be applied on a per-share basis. The idea is
similar, in some ways, to that of symbolic links (symlinks) in Unix, or
shortcuts in Windows. The share is a named pointer--with its own set of
attributes--to the object being made available by the server.
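
To make the server/share/path split concrete, here is a minimal sketch of a UNC parser. The function names and the three-buffer interface are assumptions made for this example only; a real client would also need to worry about character sets and forward slashes.

```c
#include <stddef.h>
#include <string.h>

/* Copy src[0..n) into dst (capacity cap) and NUL-terminate it.
 * Returns -1 if it does not fit. */
static int copy_field(char *dst, size_t cap, const char *src, size_t n)
{
  if (n + 1 > cap)
    return -1;
  memcpy(dst, src, n);
  dst[n] = '\0';
  return 0;
}

/* Split "\\server\share\rest" into its three parts.
 * Returns 0 on success, -1 if the string is not of that form. */
int split_unc(const char *unc,
              char *server, size_t server_cap,
              char *share, size_t share_cap,
              char *rest, size_t rest_cap)
{
  const char *p, *q;

  if (strncmp(unc, "\\\\", 2) != 0)
    return -1;                              /* must start with \\   */
  p = unc + 2;
  q = strchr(p, '\\');
  if (q == NULL || q == p)
    return -1;                              /* no server or no share */
  if (copy_field(server, server_cap, p, (size_t)(q - p)) != 0)
    return -1;

  p = q + 1;
  q = strchr(p, '\\');
  if (q == NULL)
    q = p + strlen(p);                      /* share with no path    */
  if (q == p)
    return -1;                              /* empty share name      */
  if (copy_field(share, share_cap, p, (size_t)(q - p)) != 0)
    return -1;

  if (*q == '\\')
    q++;                                    /* skip the separator    */
  return copy_field(rest, rest_cap, q, strlen(q));
}
```

Fed the example path above, this yields the server "petserver", the share "dogbytes", and the remainder "jolyon\". Note that nothing in the string reveals that "dogbytes" is really /dogs/corgi/stories on the server; only the server knows that mapping.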

[Figure 2.1]


2.2.3 THE FILE

This is the last stop on our quick UNC tour of SMB.

Files, like directories, should be fairly familiar and fairly straightforward.
As has been continually demonstrated, however, things in the CIFS world are not
always as simple as they ought to be. Our point of interest on this part of the
tour is the distinction between server filesystem syntax and semantics and
client expectations...a very gnarled knot for CIFS implementors.

Consider, for example, a bunch of Windows clients connecting to an SMB server
running on Linux. On the Linux system the filenames Corgi, corgi, and CORGI
would all be distinct because Linux filesystems are typically case-sensitive.
Windows, however, expects filenames to be case-insensitive, so all three names
are the same from the Windows point of view. Thus, we have a conflict. How does
a Linux server make all three files available to the Windows client?
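
The conflict is easy to demonstrate. The following sketch (a hypothetical helper, not taken from any real server) counts how many names in a directory listing match a requested name when case is ignored. It relies on the POSIX strcasecmp() function, which only folds case reliably for ASCII names.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() is POSIX, not ISO C */

/* Count the directory entries that match 'wanted' when case is
 * ignored. A result greater than 1 means a case-insensitive client
 * cannot tell the matching files apart by name. */
size_t count_caseless_matches(const char *const names[], size_t n,
                              const char *wanted)
{
  size_t i, hits = 0;

  for (i = 0; i < n; i++) {
    if (strcasecmp(names[i], wanted) == 0)
      hits++;
  }
  return hits;
}
```

With a listing of Corgi, corgi, and CORGI, a request for "corgi" matches all three. The server has to pick one or find some way to present the others under different names.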

Other difficult issues include:



 * filename lengths,
 * valid characters,
 * file access permissions, and
 * the end-of-line delimiter in text files.

These are complex problems, not easily solved. The CIFS protocol suite is not
designed to be agnostic with regard to such things. In fact, CIFS goes out of
its way at times to support features that are specific to DOS, OS/2, and
Windows.

...and that concludes our tour. It's time to visit the gift shoppe.


2.2.4 THE SMB URL

The UNC format is specific to one family of operating systems. Earlier on,
though, we compared UNC with the more portable and modern URI format. That's
called foreshadowing. It's a literary trick used to build suspense and
anticipation.

There is, in fact, such a thing as an SMB URL. It fits into the general URI
syntax [7] and can be used to specify files, directories, and other SMB-shared
stuff. It is intended as a more portable, and more complete way to specify SMB
paths at the application level.

As of this writing, the SMB URL is only documented in an IETF Internet Draft,
and is not yet any kind of standard. That hasn't stopped folks from implementing
it, though. The SMB URL is supported in a wide variety of products including the
KDE and GNOME desktop GUI environments, web browsers such as Galeon and
Konqueror, and Open Source CIFS projects like jCIFS and libsmbclient (the latter
is included with Samba). Thursby Software and Apple Computer also make use of
the SMB URL in their commercial CIFS implementations.

That's good news for CIFS implementors because it means that there is an
accepted, cross-platform way to identify SMB-shared resources, both within LANs
and across the Internet.


2.2.5 WAS THAT TRIP REALLY NECESSARY?

Our quick UNC tour provided an introduction to some of the basic concepts, and
annoyances, of SMB. We will expand upon those ideas as we dig more deeply into
the protocol. The UNC format itself is also important for a variety of reasons,
both historical and practical. Not least among these is that UNC strings are
used within some of the SMB messages that cross the wire.

The SMB URL format is equally significant. It is portable, flexible, and gaining
in popularity. It will also form the basis for examples given later in the text.
If you are implementing an SMB client, you will most likely want to have some
convention for identifying resources. You could invent your own, or use UNC, but
the SMB URL is probably your best option.



--------------------------------------------------------------------------------


2.3 FIRST CONTACT: REACHING THE SERVER



Getting there is half the fun.
-- Unknown   

We are approaching this thing in layers. A little history, a quick introductory
tour...and now this. It may seem like a bit of a diversion, but the goal in this
section is to figure out how a client finds the server and initiates a
connection. No, we're not dealing with SMB protocol yet, but we can't send SMB
messages until we can talk to a server.

Think of a telephone call. If you want to call your cousin in New York the first
thing you need to know is the telephone number. You could ask your uncle for the
number or look it up in the telephone book, or perhaps you have it written on a
scrap of paper somewhere in the kitchen with your favorite tofu recipes. If you
dial the wrong number you will annoy some guy in a gas station in Brooklyn. When
you dial the correct number, the underlying system will go through a complex
process to set up the connection so that you can start talking to your cousin
(or, more likely, to the answering machine).

...and if you want to connect with an SMB server you might need to resolve a
NetBIOS or DNS name to an IP address. Once you have the address, you can attempt
to open a session with the server.

Consider this simple SMB URL:



> smb://server/

From the user's perspective, that should be enough to build an initial
connection to an SMB server named "server".

From an implementation point of view, the first thing to do with this example is
to parse out the "server" substring. In URI parlance, the field we are looking
for is called the "host non-terminal" [8], and it contains the name or address of
the server to which we are trying to connect. Our term for the parsed-out string
is "Server Identifier". Once we have extracted it, the next thing we need to
know how to do is interpret it so that we can use the information to create the
session.
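
Pulling the Server Identifier out of the URL might look something like the sketch below. The function name is invented for this example, and the code takes some liberties: it expects a lower-case "smb://" prefix, ignores any "user@" portion, and copies a square-bracketed literal (the form used for IPv6 addresses in URLs) whole, brackets included.

```c
#include <stddef.h>
#include <string.h>

/* Copy the host field of an "smb://" URL into out (capacity cap).
 * The host ends at ':' (port), '/' (path), or the end of the string.
 * Returns 0 on success, -1 on error. */
int smb_url_host(const char *url, char *out, size_t cap)
{
  static const char scheme[] = "smb://";
  const char *p, *end;
  size_t n;

  if (strncmp(url, scheme, sizeof scheme - 1) != 0)
    return -1;
  p = url + (sizeof scheme - 1);

  if (*p == '[') {
    end = strchr(p, ']');
    if (end == NULL)
      return -1;                 /* unbalanced bracket            */
    end++;                       /* keep the closing bracket      */
  } else {
    end = p + strcspn(p, ":/");
  }

  n = (size_t)(end - p);
  if (n == 0 || n + 1 > cap)
    return -1;                   /* empty host or buffer too small */
  memcpy(out, p, n);
  out[n] = '\0';
  return 0;
}
```

For "smb://server/" the result is simply "server". The string is returned un-interpreted, escapes and all, because the next step needs to see the raw form.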


2.3.1 INTERPRETING THE SERVER IDENTIFIER

The SMB URL format supports the use of three different identifier types in the
host field. We went over them briefly before. They are the IP address, DNS name,
or NetBIOS name of the destination. Our next task is to figure out which is
which.
 

If you want something done right you have to do it yourself.
-- Well-known axiom

Presentation is everything, and it turns out that the code for interpreting the
Server Identifier is verbose and tedious. Most of the busywork for handling
NetBIOS names was covered in section 1, and there are plenty of tools for
dealing with IP addresses and DNS names, so to save time we will describe how to
interpret and resolve the address (and let you write the code yourself [9]).



It could be an IP address.

Check the syntax of the input to determine whether it is a valid representation
of an IP address. Do this test first. It is quick, and does not involve sending
any queries out over the network. The inet_aton() function, common on Unix-like
operating systems, does the job nicely for the four-byte IPv4 addresses used
today.

IP version 6 (IPv6) addresses are different. They are longer, harder for a human
to read, and potentially more complicated to parse out. Fortunately, when used
in URLs they are always contained within square brackets, as in the following
example:



> smb://[fe80::240:f4ff:fe1f:8243]/

The square brackets are reserved characters, used specifically for this
purpose [10]. They make it easy to identify an IPv6 IP address. Once identified,
the IPv6 address can be converted into its internal format by the inet_pton()
function, which is now supported by many systems.
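
As a rough sketch, the address test could be written as shown below. The function name is hypothetical. For simplicity it uses inet_pton() for both address families; be aware that inet_pton() is stricter than inet_aton() and rejects shorthand forms such as "127.1". It also knows nothing about IPv6 zone identifiers.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Test whether a Server Identifier is an IP address literal.
 * Returns AF_INET or AF_INET6 and fills addr (which must be at least
 * sizeof(struct in6_addr) bytes) on a match, or 0 if the string is
 * not an IP address. No network traffic is generated. */
int parse_ip_identifier(const char *id, void *addr)
{
  char   buf[INET6_ADDRSTRLEN];
  size_t len = strlen(id);

  /* IPv6 literals arrive wrapped in square brackets. */
  if (len >= 2 && id[0] == '[' && id[len - 1] == ']') {
    if (len - 2 >= sizeof buf)
      return 0;
    memcpy(buf, id + 1, len - 2);
    buf[len - 2] = '\0';
    return (inet_pton(AF_INET6, buf, addr) == 1) ? AF_INET6 : 0;
  }

  /* Otherwise try dotted-decimal IPv4. */
  return (inet_pton(AF_INET, id, addr) == 1) ? AF_INET : 0;
}
```

A return value of zero means "keep looking": the identifier may still be a NetBIOS name or a DNS name.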



You are likely to be eaten by a grue.
-- Zork, Marc Blank and David Lebling, Infocom

Note that it is, in theory, possible to register a NetBIOS name that looks
exactly like an IP address. What's worse is that it might not be the same as the
IP address of the node that registered it. That's nasty. Anyone who would do
such a thing should have their keyboard taken away. It is probably not important
to handle such situations. Defensive programming practices would suggest being
prepared, but in this case the perpetrators deserve the troubles they cause for
themselves.



It could be a NetBIOS Name.

If the Server Identifier isn't an IP address, it could be a NetBIOS name. To see
if this is the case, the first step is to look for a dot ('.'). The SMB URL
format does not allow un-escaped dots to appear in the NetBIOS name itself, so
if there is a dot character in the raw string then consider the rest of the
string to be a Scope ID. For example:



> smb://my%2Enode.scope/

is made up of the NetBIOS name "MY.NODE" and the Scope ID "SCOPE". (The URL
escape sequence for encoding a dot is %2E.)
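
Here is one way the split might be coded. It is a sketch under a few assumptions: the function name is invented, the name is forced to upper case (as in the "MY.NODE" example), escapes are decoded only in the name portion, and both output buffers must be at least one byte long.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hexval(int c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  c = toupper(c);
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Split a raw (still URL-escaped) host field at the first un-escaped
 * dot. The part before the dot is unescaped and upper-cased into
 * name; the part after it is upper-cased into scope (left empty if
 * there is no dot). Returns 0 on success, -1 on a bad escape, an
 * empty or over-long name, or a buffer that is too small. */
int split_nb_identifier(const char *raw,
                        char *name, size_t name_cap,
                        char *scope, size_t scope_cap)
{
  const char *p = raw;
  size_t n = 0, s = 0;

  while (*p != '\0' && *p != '.') {
    int c = (unsigned char)*p;

    if (c == '%') {
      int hi = hexval((unsigned char)p[1]);
      int lo = (hi < 0) ? -1 : hexval((unsigned char)p[2]);
      if (lo < 0)
        return -1;
      c = hi * 16 + lo;
      p += 3;
    } else {
      p++;
    }
    if (n + 1 >= name_cap)
      return -1;
    name[n++] = (char)toupper(c);
  }
  name[n] = '\0';

  if (*p == '.')
    p++;
  for (; *p != '\0'; p++) {
    if (s + 1 >= scope_cap)
      return -1;
    scope[s++] = (char)toupper((unsigned char)*p);
  }
  scope[s] = '\0';

  /* A NetBIOS name is 1 to 15 bytes long, not counting the suffix. */
  return (n > 0 && n <= 15) ? 0 : -1;
}
```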

Once the string has been parsed into its NetBIOS Name and Scope ID components,
the next thing to do is to send an NBT Name Query. Always use a suffix value of
0x20, which is the prescribed suffix for SMB services. The handling of the query
depends, of course, on whether the client is a B, M, P, or H node. For anything
other than a B node, the IP address of the NBNS is required. Most client
implementations keep such information in some form of configuration file or
database.

If a positive response is received, keep track of the NetBIOS name and returned
IP address. You will need them in order to connect to the server.



It could be a DNS name.

If the Server Identifier is neither an IP address nor a NetBIOS name, try DNS
name resolution. The gethostbyname() function is commonly used to resolve DNS
names to IP addresses, but be warned that this is a blocking function. It may
take quite a while for it to do its job, and your program will do nothing in the
meantime [11]. That is one reason that it is typically the last thing to try.
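
For illustration, here is a small resolver wrapper. It swaps in getaddrinfo(), the newer POSIX interface, in place of gethostbyname(); that is a choice made for this sketch, not something the text above prescribes. It blocks in exactly the same way, and the function name is made up.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve name to its first IPv4 address. Returns 0 on success and
 * stores the address (network byte order) in *out. On failure it
 * returns the non-zero getaddrinfo() error code, which can be turned
 * into a message with gai_strerror(). This call blocks. */
int resolve_ipv4(const char *name, struct in_addr *out)
{
  struct addrinfo hints, *res = NULL;
  int rc;

  memset(&hints, 0, sizeof hints);
  hints.ai_family   = AF_INET;       /* IPv4 only, to match NBT  */
  hints.ai_socktype = SOCK_STREAM;   /* we will be using TCP     */

  rc = getaddrinfo(name, NULL, &hints, &res);
  if (rc != 0)
    return rc;

  *out = ((struct sockaddr_in *)res->ai_addr)->sin_addr;
  freeaddrinfo(res);
  return 0;
}
```

If the delay matters to your application, the usual workaround is to run the lookup in a separate thread or process.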

That is how to go about determining which kind of Server Identifier you've been
given. Isn't overloading fun? Now you see why the code for handling all of this
is tedious and verbose. It really is not very difficult, though; it's just that
it takes a bit of work to get it all coded up.


2.3.2 THE DESTINATION PORT

Port 139 is for NBT, and port 445 is for raw TCP--good rules of thumb. Recall,
though, that the NBT Session Service provides a mechanism for redirection. In
addition, some security protocols use high-numbered ports to tunnel SMB
connections through firewalls. That means that the use of non-standard ports
should be supported on the client side.

The SMB URL allows the specification of a destination port number, like so:



> smb://server:1928/

Once again, that fits into standard URI syntax. If you spend any time using a
web browser, the port field should be familiar.

What this all means, however, is that the port number does not always indicate
which transport should be used. Rather the opposite; if the port number is not
specified, the default port depends upon the transport. Knowing which transport
to choose is, once again, something that requires some figuring out.
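
A sketch of the port handling follows. The names are invented for the example. The parser reports "no port given" separately from "bad port" so that the caller can fall back to the transport's default.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum smb_transport { SMB_TRANSPORT_NBT, SMB_TRANSPORT_RAW };

/* Default destination port for each transport. */
int smb_default_port(enum smb_transport t)
{
  return (t == SMB_TRANSPORT_NBT) ? 139 : 445;
}

/* Parse the optional ":port" that follows the host in a string such
 * as "server:1928" or "[fe80::1]:445".
 * Returns the port (1-65535), 0 if no port was given, -1 if the port
 * field is malformed. */
int smb_authority_port(const char *authority)
{
  const char *p = authority;
  char *end;
  long  v;

  if (*p == '[') {                  /* skip a bracketed IPv6 literal */
    p = strchr(p, ']');
    if (p == NULL)
      return -1;
    p++;
  } else {
    p += strcspn(p, ":");
  }

  if (*p == '\0')
    return 0;                       /* no port field at all          */
  if (*p != ':')
    return -1;
  p++;
  if (*p < '0' || *p > '9')
    return -1;                      /* empty, signed, or non-numeric */

  errno = 0;
  v = strtol(p, &end, 10);
  if (errno != 0 || *end != '\0' || v < 1 || v > 65535)
    return -1;
  return (int)v;
}
```

A caller would use the parsed port if it is greater than zero and otherwise ask smb_default_port() once the transport has been chosen.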


2.3.3 TRANSPORT DISCOVERY

As has been stated previously, we are only considering the NBT and naked TCP
transports. Both of these are IP-based and the behavior of SMB over these two is
nearly identical, so it does not seem as though separating them would be very
important...but this is CIFS we're talking about.

The crux of the problem is whether or not the NBT SESSION REQUEST message is
required. If the server is expecting correct NBT semantics, then we will need to
find a valid NetBIOS name to place into the CALLED NAME field. This is a
complicated process, involving a lot of trial-and-error. The recipe presented
below is only one way to go about it. A good chef knows how to adjust the
ingredients and choose seasonings to get the desired result. This is as much an
art as it is a science.

2.3.3.1 RUN NAKED

Running naked is probably the easiest transport test to try first. The procedure
is tasteful and dignified: simply assume that the server is expecting raw TCP
transport. Open a TCP connection to port 445 on the server, but do not send an
NBT SESSION REQUEST--just start sending SMB messages and see if that works.
There are four possible results from this test:

 1. If nothing is listening on port 445 at the server, the TCP connection will
    fail. If that happens, the client can fall back to using NBT on port 139.

 2. If a non-SMB service is running on the destination port, one end or the
    other will (hopefully) figure out that the messages being exchanged are
    incomprehensible, and the connection will be dropped. Again, the fall-back
    is to try NBT on port 139.

 3. The remote end may be expecting NBT transport. This should never happen when
    talking to port 445, but defensive programming practices suggest being
    prepared. If the server requires NBT transport then it will probably reply
    to the initial SMB message by sending an NBT NEGATIVE SESSION RESPONSE.

 4. The connection might, after all, succeed.

All of the above applies if the user did not specify a non-standard port number.
If the input looks more like this:



> smb://server:2891/

...then the option of falling back to NBT on port 139 is excluded. In addition,
there is no way to guess which transport type should be used if a port number
other than 139 or 445 is specified. (In theory, it is also possible to run NBT
transport on port 445 and naked transport on port 139. If you catch anyone doing
such a twisted thing you should probably notify the authorities.)

Fortunately, Windows systems (Windows95, '98, and W2K were tested) return an NBT
NEGATIVE SESSION RESPONSE if they get naked semantics on an NBT service port.
This makes sense, because it lets the client know that NBT semantics are
required. Samba's smbd goes one better and simply ignores the lack of a SESSION
REQUEST message. Samba's behavior effectively merges the two transport types and
makes the distinction between them irrelevant, which simplifies things on the
server side and makes life easier for the client.
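
Telling these outcomes apart comes down to looking at the first few bytes that come back. The sketch below is one hypothetical way to do it. It relies on two facts from outside this section: the NBT NEGATIVE SESSION RESPONSE packet type is 0x83, and every SMB message begins with the four bytes 0xFF 'S' 'M' 'B'. The enum and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum reply_kind {
  REPLY_TOO_SHORT,      /* need more bytes before deciding         */
  REPLY_SMB,            /* a SESSION MESSAGE carrying an SMB       */
  REPLY_NBT_NEGATIVE,   /* NBT NEGATIVE SESSION RESPONSE           */
  REPLY_UNKNOWN         /* something else; probably not SMB at all */
};

/* Classify the first bytes received after sending an SMB message
 * without a preceding NBT SESSION REQUEST. */
enum reply_kind classify_first_reply(const uint8_t *buf, size_t len)
{
  static const uint8_t smb_magic[4] = { 0xFF, 'S', 'M', 'B' };

  if (len < 1)
    return REPLY_TOO_SHORT;
  if (buf[0] == 0x83)
    return REPLY_NBT_NEGATIVE;      /* server insists on NBT       */
  if (buf[0] != 0x00)
    return REPLY_UNKNOWN;           /* not a SESSION MESSAGE       */
  if (len < 8)
    return REPLY_TOO_SHORT;         /* header plus SMB signature   */
  return (memcmp(buf + 4, smb_magic, 4) == 0) ? REPLY_SMB
                                              : REPLY_UNKNOWN;
}
```

On REPLY_NBT_NEGATIVE the client knows to start over with a SESSION REQUEST; on REPLY_UNKNOWN the sensible move is to drop the connection and, if no port was specified, try port 139.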
 

Real Programmers don't draw flowcharts.
-- Unknown

The transport discovery process is illustrated using the anachronistic flowchart
presented in figure 2.2.

[Figure 2.2]

2.3.3.2 USING THE NETBIOS NAME

If running naked didn't work, then you will probably need to try NBT transport.
Also, back in section 2.3.1 we talked about the different types of Server
Identifiers that most implementations support. One of those is the NetBIOS name,
and it seems logical to assume that if the Server Identifier is a NetBIOS name
then the transport will be NBT.

That's two good reasons to give NBT transport a whirl.

As stated earlier, the critical difference between the raw TCP and NBT
transports is that NBT requires the SESSION REQUEST/POSITIVE SESSION RESPONSE
exchange before the SMB messages can start flowing. The SESSION REQUEST, in
turn, must contain a valid CALLED NAME. If the CALLED NAME is not correct, then
some server implementations will reject the connection. (Windows seems to be
quite picky, but Samba ignores the CALLED NAME field.)

Finding a valid CALLED NAME is easy if the Server Identifier is a NetBIOS name
because, well... because there you are. The NetBIOS name is the correct CALLED
NAME. Also, since the Server Identifier was resolved via an NBT Name Query, the
server's IP address is known. That's everything you need.

There is one small problem with this scenario that could cause a little trouble:
some NBNS servers can be configured to pass NetBIOS name queries through to the
DNS system, which means that the DNS--not the NBNS--may have resolved the name
to an IP address. That would mean that we have a false-positive and the Server
Identifier is not, in fact, a NetBIOS name. If that happens, you could wind up
trying to make an NBT connection to a system that isn't running NBT services.
(The opposite of the "run naked" test described above.)

Detecting an SMB service that wants naked transport is not as clean and easy as
detecting one that wants NBT. In testing, a Windows2000 system running naked TCP
transport did not respond at all to an NBT SESSION REQUEST, and the client timed
out waiting for the reply. This problem is neatly avoided if naked transport is
attempted before NBT transport. Since Samba considers the SESSION REQUEST
optional, this kind of transport confusion is not an issue when talking to a
Samba server.

2.3.3.3 REVERSE-MAPPING A NETBIOS NAME

Reverse-mapping is the last, desperate means for finding a workable NetBIOS
CALLED NAME so that a valid SESSION REQUEST can be sent. Reverse-mapping is also
quite common. Your code will need to try this technique if naked transport
didn't work and the Server Identifier was a DNS name or IP address--a situation
which is not unusual.

As stated before, there is no right way to do reverse-mapping. Fortunately,
there are a few almost-right ways to go about it. Here they are:



Try a Node Status query.

Send an NBT NODE STATUS QUERY to the server. If it responds, run through the
list of returned names looking for a unique name with a suffix byte value of
0x20. Try using that name as the CALLED NAME when setting up the session. If
there are multiple names with a suffix value of 0x20, try them in series until
you get a POSITIVE SESSION RESPONSE (or until they all fail).
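
A sketch of the name scan is given below. It assumes the NODE STATUS RESPONSE layout from the NBT section: a one-byte name count followed by 18-byte entries, each holding a 16-byte name (15 bytes plus the suffix) and a two-byte flags field whose high-order bit marks a group name. The function name and interface are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NB_ENTRY_LEN  18       /* 16-byte name + 2-byte NAME_FLAGS */
#define NB_GROUP_FLAG 0x8000   /* set for group names              */

/* Scan the name array of a NODE STATUS RESPONSE. rdata points at the
 * name-count byte. Each unique name with suffix 0x20 is copied into
 * out[], with its trailing padding spaces removed.
 * Returns the number of names stored. */
size_t find_server_names(const uint8_t *rdata, size_t rdata_len,
                         char out[][16], size_t max_out)
{
  size_t count, i, found = 0;

  if (rdata_len < 1)
    return 0;
  count = rdata[0];

  for (i = 0; i < count && found < max_out; i++) {
    size_t         off = 1 + i * NB_ENTRY_LEN;
    const uint8_t *e;
    uint16_t       flags;
    size_t         n = 15;

    if (off + NB_ENTRY_LEN > rdata_len)
      break;                                   /* truncated reply  */
    e = rdata + off;
    flags = (uint16_t)((e[16] << 8) | e[17]);
    if (e[15] != 0x20 || (flags & NB_GROUP_FLAG))
      continue;                                /* not what we want */

    while (n > 0 && e[n - 1] == ' ')
      n--;                                     /* strip padding    */
    memcpy(out[found], e, n);
    out[found][n] = '\0';
    found++;
  }
  return found;
}
```

The caller then tries each returned name, in order, as the CALLED NAME.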

Stop laughing. It gets better.



Try using the Generic CALLED NAME.

This kludge was introduced in Windows NT4 and has been adopted by many other
implementations. It is fairly common, but not universal.

The generic CALLED NAME is *SMBSERVER<20> (that is, "*SMBSERVER" with a suffix
byte value of 0x20). Think of it as an alias, allowing you to connect to the SMB
server without knowing its "real", registered NetBIOS name. The *SMBSERVER<20>
name starts with an asterisk, which is against the rules, so it is never
registered with the NBT Name Service. If you send a unicast Name Query for this
name, the destination node should always send a NEGATIVE NAME QUERY RESPONSE in
reply (assuming that it is actually running NBT).
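
Building the generic name is mechanical. The sketch below pads the name to 15 bytes, appends the suffix, and applies the first-level encoding described in the NBT section (each byte is split into two nibbles and 'A' is added to each). The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the 16-byte NetBIOS name: 'name' padded with spaces to 15
 * bytes, followed by the suffix byte.
 * Returns 0 on success, -1 if name is longer than 15 bytes. */
int nb_make_name(uint8_t out[16], const char *name, uint8_t suffix)
{
  size_t n = strlen(name);

  if (n > 15)
    return -1;
  memset(out, ' ', 15);
  memcpy(out, name, n);
  out[15] = suffix;
  return 0;
}

/* First-level encode 16 bytes into 32 characters plus a NUL. */
void nb_l1_encode(char out[33], const uint8_t name[16])
{
  size_t i;

  for (i = 0; i < 16; i++) {
    out[2 * i]     = (char)('A' + (name[i] >> 4));
    out[2 * i + 1] = (char)('A' + (name[i] & 0x0F));
  }
  out[32] = '\0';
}
```

Calling nb_make_name() with "*SMBSERVER" and a suffix of 0x20, then encoding the result, produces the string CKFDENECFDEFFCFGEFFCCACACACACACA. You will see that string often in packet captures.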

A bit awkward but it does work...sometimes. Now for the coup de grâce.



 

"Guess," said Marvin.
-- Restaurant at the End of the Universe, Douglas Adams

Try using the DNS Name.

Try using the first label of the DNS name (the hostname of the server) as the
CALLED NAME. If you were given an IP address you will need to do a reverse DNS
lookup to get a name to play with (we suggested earlier that the DNS name might
come in handy). As always, use a suffix byte value of 0x20.

If the first label doesn't work, try the first two labels (retaining the dot)
and so on until you have a string that is longer than 15 bytes, at which point
you give up.

Yes, there are implementations which actually do this.
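
The guessing procedure can be sketched as a function that produces the k-th candidate on request. The function name is made up, and forcing the result to upper case is an assumption added here (NetBIOS names are conventionally upper case); it is not part of the recipe above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Write the k-th CALLED NAME candidate into out: k = 1 gives the
 * first label of dns_name, k = 2 the first two labels with the dot
 * retained, and so on.
 * Returns 1 if a candidate was produced, 0 if there are no more
 * labels or the candidate would be longer than 15 bytes. */
int called_name_candidate(const char *dns_name, size_t k, char out[16])
{
  const char *end = dns_name;
  size_t i, len;

  if (k == 0)
    return 0;
  for (i = 0; i < k; i++) {
    if (i > 0) {
      if (end[0] != '.' || end[1] == '\0')
        return 0;                   /* ran out of labels          */
      end++;                        /* the dot stays in the name  */
    }
    end += strcspn(end, ".");       /* advance to end of label    */
  }

  len = (size_t)(end - dns_name);
  if (len == 0 || len > 15)
    return 0;                       /* time to give up            */
  for (i = 0; i < len; i++)
    out[i] = (char)toupper((unsigned char)dns_name[i]);
  out[len] = '\0';
  return 1;
}
```

A client would call this with k = 1, 2, 3, and so on, attempting a SESSION REQUEST with each candidate (and a suffix of 0x20) until one is accepted or the function returns zero.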

If none of those options worked, then it is finally time to send an error
message back to the user explaining that the Server Identifier is no good.



Ignorance is Bliss Omission Alert:


--------------------------------------------------------------------------------

We have not fully discussed IPv6.

As it currently stands, NBT doesn't work with IPv6. All of the IP address fields
in the NBT messages are four-byte fields, but IPv6 addresses are longer. There
has been talk of NetBIOS emulation over IPv6, but if such a thing ever happens
(unlikely) it will take a while before the proposal is worked out and accepted.

Unfortunately, when it comes to SMB over IPv6 the author is clueless. It is
probably just like SMB over naked transport, except that the addresses are IPv6
addresses.
 




2.3.4 CONNECTING TO THE SERVER

We are still dealing with the transport layer and haven't actually seen any SMBs
yet. It is, however, finally time for some code. Listing 2.1 handles the basics
of opening the connection with an SMB server. It is example code so, of course,
it takes a few shortcuts. For instance, it completely side-steps Server
Identifier interpretation and transport discovery (that is, everything we just
covered).

[Listing 2.1]

The code in listing 2.1 provides an outline for setting up the session via NBT
or raw TCP. With that step behind us, we won't have to deal with the details of
the transport layer any longer. Let's run through some code highlights quickly
and put all that transport stuff behind us.



Transport: The program does not attempt to discover which transport to use. As
written, it assumes NBT transport. To try naked transport, simply comment out
the call to RequestNBTSession() in main().



The Command Line: Because we are shamelessly avoiding presenting code that
interprets Server Identifiers, the example program makes the user do all of the
work. The user must enter the NetBIOS name and IP address of the server.
Entering a destination port number is optional.

The name entered on the command line will be used as the CALLED NAME. If the
input string begins with an asterisk, the generic *SMBSERVER<20> name will be
used instead.



The CALLING NAME (NBT source address): The program inserts SMBCLIENT<00> as the
CALLING NAME.

In a correct implementation, the name should be the client's NetBIOS Machine
Name (which is typically the same as the client's DNS hostname) with a suffix
byte value of 0x00.

The contents of the CALLING NAME field are not particularly significant.
According to the expired Leach/Naik CIFS Internet Draft, the same name from the
same IP address is supposed to represent the same client...but you knew that.
Samba can make use of the CALLING NAME via a macro in the smb.conf configuration
file. The macro is used for all sorts of things, including generating per-client
log files.




We leave this as an
exercise for the reader.
-- Unknown

Transporting SMBs: A key feature of this program is the line within main() which
reads:



> /* ** Do real work here. ** */

That's where the SMB stuff is supposed to happen. At that point in the code, the
session has been established on top of the transport layer and it is time to
start moving those Server Message Blocks.

Use the program above as a starting point for building your own SMB client
utility. Add a parser capable of dissecting the UNC or SMB URL format, and then
code up Server Identifier resolution and transport discovery, as described
above. When you have all of that put together, you will have completed the
foundation of your SMB client.



--------------------------------------------------------------------------------


2.4 SMB IN ITS NATURAL HABITAT



Are we there yet?
-- Kids in the back seat.   

We have spent a lot of time and effort preparing for this expedition, and we are
finally ready to venture into SMB territory. It can be a treacherous journey,
though, so before we push ahead we should re-check our equipment.



Test Server

If you are going to start testing, you have to have something at which to fling
packets. When choosing a test server, keep in mind that SMB has grown and
changed and evolved and adapted and mutated over the years. You want a server
that can be configured to meet your testing needs. Samba, of course, is highly
configurable. If you know your way around the Windows Registry, you may have
luck with those systems as well. In particular, you probably want to avoid
strong password encryption during the initial stages. Handling authentication is
a big chunk of work, and it is best to try and reduce the number of simultaneous
problems to a manageable few.



Repetitive Terminology Redundancy Notification Alert Alert:


--------------------------------------------------------------------------------

The SMB server software running on a file server node is known as the "File
Server Service", or just "Server Service".

When running on top of NBT, the Server Service always registers a NetBIOS name
composed of the Machine Name and, of course, a suffix value of 0x20. The Machine
Name is typically--but not necessarily--the same as the DNS host name.
 





Test Client

The next thing you will want is a packet flinger. That is, a working client. You
need this for testing, and to compare behavior when debugging your own client.
Samba offers the smbclient utility, and jCIFS comes with a variety of example
programs. Windows systems all have SMB support built-in. That's quite a
selection from which to choose.



Sniffer

Always your best friend. A good packet analyzer--one with a lot of built-in
knowledge of SMB--will be your trusted guide through the SMB jungle.



Documentation

When exploring NBT we relied upon RFC 1001 and RFC 1002 as if they were ancient
maps, drawn on cracked and drying parchment, handed down to us by those who had
gone before. In the wilds of SMB territory, we will count on the SNIA CIFS
Technical Reference as our primary resource. The old X/Open SMB specification
and the SMB/CIFS documentation available from Microsoft's FTP server will also
come in handy. For the sake of efficiency, from here on out we will be a bit
less formal and refer to the SNIA doc as "the SNIA doc", and the X/Open doc as
"the X/Open doc".



Yet Another Tasty Terminology Treat Alert:


--------------------------------------------------------------------------------

As we have explained, "SMB" is the Server Message Block protocol. It is also
true that "an SMB" is a message. In order to implement SMB, one must learn to
send and receive SMBs.

Got that?
 



Keep in mind that the goal of our first trip into the wilds of SMB-land is to
become familiar with the terrain and to study SMBs in their natural habitat, so
we can learn about their anatomy and behavior. We are not ready yet for a
detailed study of SMB innards. That will come later.


2.4.1 OUR VERY FIRST LIVE SMBS

We need to capture a few SMBs to see what they look like up close. That means
it's time to take a look at the wire and see what's there to be seen. Fire up
your protocol analyzer, and then your SMB client. If you can configure your test
server to allow anonymous connections (no username, no password) it will
simplify things at this stage. If you can't, then things won't run quite as they
are shown below. Don't worry, it will be close enough.

For this example, we will use the Exists.java program that comes with jCIFS. It
is a very simple utility that does nothing more than verify the existence of the
object specified by the given SMB URL string, like so:



 shell


  $ java Exists smb://smedley/home
  smb://smedley/home exists
  $
  

The above shows that we were able to access the HOME share on node SMEDLEY. A
similar test can be performed using Samba's smbclient, or with the NET USE
command under Windows[12]:



 DOS Prompt


  C:\> net use \\smedley\home
  The command was completed successfully.

  C:\> net use /d \\smedley\home
  The command was completed successfully.

  C:\>
  

Those simple commands will generate the packets we want to capture and study.
Stop your sniffer and take a look at the trace. You should see a chain of events
similar to the following:




  No. Source    Destination       Protocol Info
  --- --------  ----------------  -------- ------------------------------
    1 Marika    255.255.255.255   NBNS     Name query
    2 Smedley   Marika            NBNS     Name query response
    3 Marika    Smedley           TCP      34102 > netbios-ssn [SYN]
    4 Smedley   Marika            TCP      netbios-ssn > 34102 [SYN, ACK]
    5 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]
    6 Marika    Smedley           NBSS     Session request
    7 Smedley   Marika            NBSS     Positive session response
    8 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]
    9 Marika    Smedley           SMB      Negotiate Protocol Request
   10 Smedley   Marika            SMB      Negotiate Protocol Response
   11 Marika    Smedley           SMB      Session Setup AndX Request
   12 Smedley   Marika            SMB      Session Setup AndX Response
   13 Marika    Smedley           TCP      34102 > netbios-ssn [FIN, ACK]
   14 Smedley   Marika            TCP      netbios-ssn > 34102 [FIN, ACK]  
   15 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]



The above is edited output from an Ethereal capture[13]. The packets were
generated using the jCIFS Exists utility, as described above. In this case jCIFS
was talking to an old Windows95 system, but any SMB server should produce the
same or similar results.

The trace is reasonably simple. The first thing that node MARIKA does is send a
broadcast NBT Name query to find node SMEDLEY, and SMEDLEY responds. Packets 3,
4, & 5 show the TCP session being created. (Note that netbios-ssn is the
descriptive name given to port 139.) Packets 6 and 7 are the NBT SESSION
REQUEST/SESSION RESPONSE exchange, and packet #8 is an ACK message, which is
just TCP taking care of its business.

Packets 9 and 10 are what we want. These are our first SMBs.


2.4.2 SMB MESSAGE STRUCTURE



I never metaphor I couldn't mix.
-- Me   

Figure 2.3 provides an overview of SMB gross anatomy. It shows that SMBs are
composed of three basic parts:



 * the Header,
 * the Parameter Block, and
 * the Data Block.

Either or both of the latter two segments may be vestigial (size == 0) in some
specimens.

[Figure 2.3]

2.4.2.1 SMB MESSAGE HEADER

Starting at the top, the SMB header is arranged like so:



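     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+---------------+---------------+
    |     0xff      |      'S'      |      'M'      |      'B'      |
    +---------------+---------------+---------------+---------------+
    |    COMMAND    |                   STATUS...                   |
    +---------------+---------------+-------------------------------+
    |   ...STATUS   |     FLAGS     |            FLAGS2             |
    +---------------+---------------+-------------------------------+
    |                             EXTRA                             |
    |                              ...                              |
    |                              ...                              |
    +-------------------------------+-------------------------------+
    |              TID              |              PID              |
    +-------------------------------+-------------------------------+
    |              UID              |              MID              |
    +-------------------------------+-------------------------------+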

We can also dissect the header using the simple syntax presented previously:

    SMB_HEADER
      {
      PROTOCOL  = "\xffSMB"
      COMMAND   = <SMB Command code (one byte)>
      STATUS    = <Status code>
      FLAGS     = <Old flags>
      FLAGS2    = <New flags>
      EXTRA     = <Sometimes used for additional data>
      TID       = <Tree ID>
      PID       = <Process ID>
      UID       = <User ID>
      MID       = <Multiplex ID>
      }

We now have a pair of perspectives on the header structure. Time for some good,
old-fashioned descriptive text.



The PROTOCOL and COMMAND Fields: The SMB header starts off easily enough. The
first four bytes are the protocol identifier string, which always has the same
value: "\xffSMB". It's not particularly clear[14] why this is included in the SMBs
but there it is, and it's in all of them.

The next byte is the COMMAND field, which tells us what kind of SMB we are
looking at. In the NEGOTIATE PROTOCOL messages captured above, the COMMAND field
has a value of 0x72 (a.k.a. SMB_COM_NEGOTIATE). The SNIA doc has a list of the
available command codes. That list is probably complete, but this is SMB we are
talking about so you never know...
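As a quick illustration, a receiver might check the protocol identifier and pull out the COMMAND byte along these lines. This is only a sketch: the function name smb_command() is invented here, and the 32-byte header size is simply the sum of the field widths in the layout above.

```c
#include <stddef.h>
#include <string.h>

#define SMB_HDR_SIZE       32     /* fixed size of the SMB header */
#define SMB_COM_NEGOTIATE  0x72

/* Return the COMMAND byte if buf starts with a plausible SMB header,
 * or -1 if the buffer is too short or the "\xffSMB" marker is missing.
 */
int smb_command(const unsigned char *buf, size_t len)
{
  static const unsigned char magic[4] = { 0xff, 'S', 'M', 'B' };

  if (buf == NULL || len < SMB_HDR_SIZE)
    return -1;
  if (memcmp(buf, magic, sizeof magic) != 0)
    return -1;
  return buf[4];                  /* COMMAND follows the four ID bytes */
}
```

For the NEGOTIATE PROTOCOL messages captured above, smb_command() would return 0x72.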



The STATUS Field: Now things start to get surreally interesting.

DOS and OS/2 use 16-bit error codes, grouped into classes. To accommodate these
codes, the STATUS field is subdivided like so:



     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+-------------------------------+
    |  ErrorClass   |  <reserved>   |           ErrorCode           |
    +---------------+---------------+-------------------------------+

WindowsNT introduced a new set of 32-bit error codes, known as NT_STATUS codes.
These use the entire status field to hold the NT_Status value:



     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------------------------------------------------------+
    |                           NT_Status                           |
    +---------------------------------------------------------------+



Be afraid. Be very afraid.
-- Veronica Quaife
(Geena Davis)
The Fly (1986)

With two error code formats from which to choose, the client and server must
confer to decide which set will be used. How that is done will be explained
later on. Error code handling is a large-sized topic with extra sauce.



FLAGS and FLAGS2: Look around the Web for a copy of a document called
COREP.TXT[15]. This is probably the earliest SMB documentation that is also easy
to find. In COREP.TXT, you can see that the original SMB header layout reserved
fifteen bytes following the error code field. Those fifteen bytes have, over
time, been carved up for a variety of uses.

The first formerly-reserved byte is now known as the FLAGS field. The bits of
the FLAGS field are used to modify the interpretation of the SMB. For example,
the highest-order bit is used to indicate whether the SMB is a request (0) or a
response (1).

Following the FLAGS field is the two-byte FLAGS2 field. This set of bits is used
to indicate the use of newer features, such as the 32-bit NT_STATUS error codes.
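Here is a small sketch of how those two fields might be tested in code. The offsets (9 for FLAGS, 10 for FLAGS2) follow from the header layout shown earlier. The macro names and the bit values 0x80 and 0x4000 are not given in the text above; they are taken from commonly published SMB flag definitions, so treat them as assumptions and check them against the SNIA doc.

```c
#include <stdint.h>

#define SMB_FLAGS_REPLY       0x80    /* FLAGS: highest-order bit, set in responses */
#define SMB_FLAGS2_NT_STATUS  0x4000  /* FLAGS2: STATUS holds an NT_STATUS code     */

/* hdr points at the start of a complete 32-byte SMB header. */
int smb_is_response(const unsigned char *hdr)
{
  return (hdr[9] & SMB_FLAGS_REPLY) != 0;
}

int smb_uses_nt_status(const unsigned char *hdr)
{
  /* FLAGS2 is a two-byte field stored low byte first. */
  uint16_t flags2 = (uint16_t)(hdr[10] | (hdr[11] << 8));

  return (flags2 & SMB_FLAGS2_NT_STATUS) != 0;
}
```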



The EXTRA Field: The EXTRA field takes up most of the remaining
formerly-reserved bytes. It contains two subfields, as shown below:



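     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-------------------------------+-------------------------------+
    |            PidHigh            |         Signature...          |
    +-------------------------------+-------------------------------+
    |                        ...Signature...                        |
    +-------------------------------+-------------------------------+
    |         ...Signature          |           <unused>            |
    +-------------------------------+-------------------------------+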

The PidHigh subfield is used to accommodate systems that have 32-bit Process
IDs. The original SMB header format only had room for 16-bit PIDs (in the PID
field, described further on).

The 8-byte Signature subfield is for SMB message signing, which uses
cryptography to protect against a variety of attacks that might be tried by
bad guys hoping to gain unauthorized access to SMB shares.

When not in use, these fields must be filled with zeros.



TID, PID, UID, and MID:

TID:  The "Tree ID".
In SMB, a share name typically represents a directory or subdirectory tree on
the server. The SMB used to open a share is called a "Tree Connect" because it
allows the client to connect to the shared [sub]directory tree. That's where the
name comes from. The TID field is used to identify connections to shares once
they have been established.

PID:  The "Process ID".
This value is set by the client, and is intended as an identifier for the
process sending the SMB request. The most important thing to note regarding the
PID is that file locking and access modes are maintained relative to the value
in this field.

The PID is 16 bits wide, but it can be extended to 32 bits using the
EXTRA.PidHigh field described earlier.

UID:  The "User ID".
This is also known as a VUID (Virtual User ID). It is assigned by the server
after the user has authenticated and is valid until the user logs off. It does
not need to be the user's actual User ID on the server system. Think of it as a
session token assigned to a successful logon.

MID:  The "Multiplex ID".
This is used by the client to keep track of multiple outstanding requests. The
server must echo back the MID and the PID provided in the client request. The
client can use those values to make sure that the reply is matched up to the
correct request.

The TID and [V]UID are assigned and managed by the server, while the PID and MID
are assigned by the client. It is important to note that the values in these
fields do not necessarily have any meaning outside of the SMB connection. The
PID, for example, does not need to be the actual ID of the client process. The
client and server assign values to these fields in order to keep track of
context, and that's all.
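One way to picture how a client uses these fields is the sketch below, which pulls the four IDs out of a received header and checks whether a reply matches an outstanding request. The struct and function names are invented for this example. The byte offsets (PidHigh at 12, TID at 24, PID at 26, UID at 28, MID at 30) are derived from the header layout shown earlier, and all values are read in SMB byte order (low byte first).

```c
#include <stdint.h>

struct smb_ids {
  uint16_t tid;   /* Tree ID,      assigned by the server               */
  uint32_t pid;   /* Process ID,   assigned by the client (PidHigh:PID) */
  uint16_t uid;   /* [V]UID,       assigned by the server               */
  uint16_t mid;   /* Multiplex ID, assigned by the client               */
};

static uint16_t get_le16(const unsigned char *p)
{
  return (uint16_t)(p[0] | (p[1] << 8));
}

/* hdr must point at a complete 32-byte SMB header. */
void smb_get_ids(const unsigned char *hdr, struct smb_ids *ids)
{
  ids->tid = get_le16(hdr + 24);
  ids->pid = ((uint32_t)get_le16(hdr + 12) << 16) | get_le16(hdr + 26);
  ids->uid = get_le16(hdr + 28);
  ids->mid = get_le16(hdr + 30);
}

/* A response belongs to a request if the server echoed both MID and PID. */
int smb_reply_matches(const struct smb_ids *req, const struct smb_ids *rsp)
{
  return req->mid == rsp->mid && req->pid == rsp->pid;
}
```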

2.4.2.2 SMB MESSAGE PARAMETERS

In the middle of the SMB message are two fields labeled WordCount and Words[].
For our purposes, we will identify these two fields as being the SMB_PARAMETERS
block, which looks like this:



     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 . . .
    +---------------+------------------------- - - -
    |   WordCount   |  Words...
    +---------------+------------------------- - - -



    SMB_PARAMETERS
      {
      WordCount         = <Number of words in the Words array>
      Words[WordCount]  = <SMB parameters; varies with SMB command>
      }

The Words field is simply a block of data that is 2 × WordCount bytes in length.
Perhaps at one time the intention was that it would contain only two-byte values
(a quick look at COREP.TXT suggests that this is the case). In practice, all
sorts of stuff is thrown in there.

Each SMB message type (species?) has a different record structure that is
carried in the Words block. Think of that structure as representing the
parameters passed to a function (the function identified by the SMB command code
listed in the header).

2.4.2.3 SMB MESSAGE DATA

Following the SMB_PARAMETERS is another block of data, the content of which also
varies in structure on a per-SMB basis:



     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 . . .
    +-------------------------------+--------- - - -
    |           ByteCount           |  Bytes...
    +-------------------------------+--------- - - -



    SMB_DATA
      {
      ByteCount        = <Number of bytes in the Bytes field>
      Bytes[ByteCount] = <Contents vary with SMB command>
      }

The Bytes field holds the data to be manipulated. For example, it may contain
the data retrieved in response to a READ operation, or the data to be written by
a WRITE operation. In many cases, though, the SMB_DATA block is just another
record structure with several subfields. Through time, SMB has evolved lazily
and any functional distinction that may have separated the Parameter and Data
blocks has been blurred.

Note that the SMB_DATA.ByteCount field is an unsigned short, while the
SMB_PARAMETERS.WordCount field is an unsigned byte. That means that the
SMB_PARAMETERS.Words block is limited in length to 510 bytes (2 × 255), while
SMB_DATA.Bytes may be as much as 65535 bytes in length. If you add all that up,
and then add in the SMB_PARAMETERS.WordCount field, the SMB_DATA.ByteCount
field, and the size of the header, you will find that the whole thing fits
easily into the 2^17-1 bytes made available in the NBT SESSION MESSAGE header.
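The sketch below shows one way to locate the first Parameter and Data blocks in a received message, with the bounds checks that a careful parser would want. The struct and function names are made up for illustration. The last line spells out the size arithmetic: 32 + 1 + 510 + 2 + 65535 = 66080 bytes, which is comfortably less than 2^17-1 = 131071.

```c
#include <stddef.h>
#include <stdint.h>

#define SMB_HDR_SIZE 32

struct smb_blocks {
  uint8_t              word_count;  /* SMB_PARAMETERS.WordCount */
  const unsigned char *words;       /* 2 * word_count bytes     */
  uint16_t             byte_count;  /* SMB_DATA.ByteCount       */
  const unsigned char *bytes;       /* byte_count bytes         */
};

/* Locate the Parameter and Data blocks that follow the header.
 * Returns 0 on success, -1 if the message is truncated.
 */
int smb_find_blocks(const unsigned char *msg, size_t len, struct smb_blocks *b)
{
  size_t pos = SMB_HDR_SIZE;

  if (len < pos + 1)
    return -1;
  b->word_count = msg[pos++];
  if (len - pos < 2u * b->word_count)
    return -1;
  b->words = msg + pos;
  pos += 2u * b->word_count;

  if (len - pos < 2)
    return -1;
  b->byte_count = (uint16_t)(msg[pos] | (msg[pos + 1] << 8));
  pos += 2;
  if (len - pos < b->byte_count)
    return -1;
  b->bytes = msg + pos;
  return 0;
}

/* Largest message the field widths allow: 32 + 1 + 510 + 2 + 65535. */
enum { SMB_MAX_BY_FIELDS = SMB_HDR_SIZE + 1 + 2 * 255 + 2 + 65535 };
```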


2.4.3 CASE IN POINT: NEGOTIATE PROTOCOL

Now that we have an overview of the structure of SMB messages, we can take a
closer look at our live specimen. Remember packets 9 and 10 from the capture we
made earlier? They show a NEGOTIATE PROTOCOL exchange. Let's get out the
tweezers, the pocket knife, & dad's hammer and see what's inside.

    NEGOTIATE_PROTOCOL_REQUEST
      {
      SMB_HEADER
        {
        PROTOCOL  = "\xffSMB"
        COMMAND   = SMB_COM_NEGOTIATE (0x72)
        STATUS
          {
          ErrorClass = 0x00   (Success)
          ErrorCode  = 0x0000 (No Error)
          }
        FLAGS     = 0x18 (Pathnames are case-insensitive)
        FLAGS2    = 0x8001 (Unicode and long filename support)
        EXTRA
          {
          PidHigh    = 0x0000
          Signature  = 0 (all bytes zero filled)
          }
        TID       = 0 (Not yet known)
        PID       = <Client Process ID>
        UID       = 0 (Not yet known)
        MID       = 2 (often 0 or 1, but varies per OS)
        }
      SMB_PARAMETERS
        {
        WordCount = 0
        Words     = <empty>
        }
      SMB_DATA
        {
        ByteCount = 12
        Bytes
          {
          BufferFormat = 0x02 (Dialect)
          Name         = "NT LM 0.12" (nul terminated)
          }
        }
      }

The breakdown of packet 9 shows the SMB NEGOTIATE PROTOCOL REQUEST as sent by
the jCIFS Exists utility. Other clients will use slightly different values, but
they are all variations on the same theme. Some features worth noting:



 * The COMMAND field has a value of 0x72 (SMB_COM_NEGOTIATE). That's how we know
   that this is a NEGOTIATE PROTOCOL message. We also know that it is a REQUEST
   rather than a RESPONSE because the highest-order bit in the FLAGS field has a
   value of zero (0).



 * The STATUS field is all zeros at this point because we haven't yet done
   anything to cause an error. Also, the error messages are presented in the
   older DOS format. This is because jCIFS is indicating, via a bit in the
   FLAGS2 field, that it is using the DOS format. We'll dig into those bits
   later on.



 * Several fields (the EXTRA.Signature, the TID, and the UID, to name a few)
   contain zeros. The content of these fields has not yet been determined, and
   they may or may not be filled in later on. It all depends upon the types of
   SMB requests that are issued. Stay tuned.



 * In this particular SMB the Parameter block is empty and all of the useful
   information is being carried in the Data block. In contrast, the response
   packet from the server (packet 10) makes use of both the Parameter and Data
   blocks (assuming that there are no errors). See for yourself by looking at
   the NEGOTIATE PROTOCOL RESPONSE in your capture.
   
   The Data block in the request contains the list of protocols that the client
   is able to speak. jCIFS only knows one dialect, so only one name is listed in
   the message above. As you can see, jCIFS implements the "NT LM 0.12" dialect
   (the most recent and widely supported as of this writing). Other clients,
   such as Samba's smbclient, support a longer list of dialects.
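To tie the dissection back to bytes on the wire, here is a rough sketch that assembles a request like the one in packet 9. The function name build_negotiate() is hypothetical, and the FLAGS and FLAGS2 values are simply copied from the breakdown above rather than computed. The buffer holds the SMB only; the NBT SESSION MESSAGE header that precedes it on the wire is not included.

```c
#include <stddef.h>
#include <string.h>

#define SMB_HDR_SIZE       32
#define SMB_COM_NEGOTIATE  0x72

/* Write a NEGOTIATE PROTOCOL REQUEST offering a single dialect into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 * pid and mid are whatever values the caller uses to track the request.
 */
size_t build_negotiate(unsigned char *buf, size_t bufsize,
                       const char *dialect,
                       unsigned short pid, unsigned short mid)
{
  size_t dlen  = strlen(dialect) + 1;          /* include the nul     */
  size_t bytes = 1 + dlen;                     /* BufferFormat + name */
  size_t total = SMB_HDR_SIZE + 1 + 2 + bytes;
  unsigned char *p = buf;

  if (bytes > 0xFFFF || bufsize < total)
    return 0;

  memset(buf, 0, total);             /* STATUS, EXTRA, TID, UID all zero */
  memcpy(p, "\xffSMB", 4);           /* PROTOCOL                         */
  p[4]  = SMB_COM_NEGOTIATE;         /* COMMAND                          */
  p[9]  = 0x18;                      /* FLAGS, as seen in packet 9       */
  p[10] = 0x01;                      /* FLAGS2 = 0x8001, low byte first  */
  p[11] = 0x80;
  p[26] = (unsigned char)(pid & 0xFF);
  p[27] = (unsigned char)(pid >> 8);
  p[30] = (unsigned char)(mid & 0xFF);
  p[31] = (unsigned char)(mid >> 8);

  p += SMB_HDR_SIZE;
  *p++ = 0;                                    /* WordCount = 0          */
  *p++ = (unsigned char)(bytes & 0xFF);        /* ByteCount, low byte    */
  *p++ = (unsigned char)(bytes >> 8);          /* ByteCount, high byte   */
  *p++ = 0x02;                                 /* BufferFormat: Dialect  */
  memcpy(p, dialect, dlen);                    /* dialect name plus nul  */
  return total;
}
```

Called with the dialect "NT LM 0.12", the function produces a 47-byte message with a ByteCount of 12, matching the breakdown above.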


2.4.4 THE ANDX MUTATION

In the trace given above, Ethereal has identified packets 11 and 12 as being a
SESSION SETUP ANDX exchange[16]. The term "ANDX" at the end of the names indicates
that these messages belong to a curious class of creatures known as "AndX
messages". SMB AndX messages are actually several SMBs combined into a single
symbiotic packet as shown in figure 2.4. It is an efficient mutation.
 

<tpot> shouldn't that be an AntX?
-- Tim Potter on IRC   

[Figure 2.4]

AndX messages work something like a linked list. Each Parameter block in an AndX
message begins with the following structure:



     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+-------------------------------+
    |  AndXCommand  |  <reserved>   |          AndXOffset           |
    +---------------+---------------+-------------------------------+

The AndXCommand field provides the SMB command code for the next AndX block in
the list (not the current one). The AndXOffset contains the byte index, relative
to the start of the SMB header, of that next AndX block--think of it as a
pointer. Since the AndXOffset value is independent of the
SMB_PARAMETERS.WordCount and SMB_DATA.ByteCount values, it is possible to
provide padding between the AndX blocks as shown in figure 2.5.

[Figure 2.5]

Now that we have a general idea of what an SMB AndX message looks like we are
ready to dissect packet 11. It looks like this:



    SESSION_SETUP_ANDX_REQUEST
      {
      SMB_HEADER
        {
        PROTOCOL  = "\xffSMB"
        COMMAND   = SMB_COM_SESSION_SETUP_ANDX (0x73)
        STATUS
          {
          ErrorClass = 0x00   (Success)
          ErrorCode  = 0x0000 (No Error)
          }
        FLAGS     = 0x18 (Pathnames are case-insensitive)
        FLAGS2    = 0x0001 (Long filename support)
        EXTRA
          {
          PidHigh    = 0x0000
          Signature  = 0 (all bytes zero filled)
          }
        TID       = 0 (Not yet known)
        PID       = <Client Process ID>
        UID       = 0 (Not yet known)
        MID       = 2 (often 0 or 1, but varies per OS)
        }
      ANDX_BLOCK[0] (Session Setup AndX Request)
        {
        SMB_PARAMETERS
          {
          WordCount     = 13
          AndXCommand   = SMB_COM_TREE_CONNECT_ANDX (0x75)
          AndXOffset    = 79
          MaxBufferSize = 1300
          MaxMpxCount   = 2
          VcNumber      = 1
          SessionKey    = 0
          CaseInsensitivePasswordLength = 0
          CaseSensitivePasswordLength   = 0
          Capabilities  = 0x00000014
          }
        SMB_DATA
          {
          ByteCount     = 20
          AccountName   = "GUEST"
          PrimaryDomain = "?"
          NativeOS      = "Linux"
          NativeLanMan  = "jCIFS"
          }
        }
      ANDX_BLOCK[1] (Tree Connect AndX Request)
        {
        SMB_PARAMETERS
          {
          WordCount       = 4
          AndXCommand     = SMB_COM_NONE (0xFF)
          AndXOffset      = 0
          Flags           = 0x0000
          PasswordLength  = 1
          }
        SMB_DATA
          {
          ByteCount       = 22
          Password        = ""
          Path            = "\\SMEDLEY\HOME"
          Service         = "?????"  (yes, really)
          }
        }
      }

There is a lot of information in that message, but we are not yet ready to dig
into the details. There is just too much to cover all of it at once. Our goals
right now are simply to highlight the workings of the AndX blocks, and to
provide a glimpse inside the SESSION SETUP ANDX & TREE CONNECT ANDX sub-messages
so that we will have something to talk about later on.

The block labeled ANDX_BLOCK[0] is the body of the SESSION SETUP REQUEST, and
ANDX_BLOCK[1] contains the TREE CONNECT REQUEST. Note that the AndXCommand field
in the final AndX block is given a value of 0xFF. This, in addition to the zero
offset in the AndXOffset field, indicates the end of the AndX list.
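The linked-list behavior can be sketched as a simple chain walker. The function name andx_list() is invented for this example, and it makes one simplifying assumption that a real client could not: that every command in the chain is an AndX command, so every block starts with WordCount followed by the AndXCommand, reserved, and AndXOffset fields. Offsets are counted from the start of the SMB header, as described above.

```c
#include <stddef.h>

#define SMB_HDR_SIZE  32
#define SMB_COM_NONE  0xFF

/* List the AndX blocks in msg.  On return, cmds[i] and offs[i] hold the
 * command code and byte offset (from the start of the SMB header) of block i.
 * Assumes every command in the chain is an AndX command.
 * Returns the number of blocks found, or -1 if the chain is malformed
 * or holds more than max blocks.
 */
int andx_list(const unsigned char *msg, size_t len,
              unsigned char cmds[], size_t offs[], int max)
{
  size_t        off = SMB_HDR_SIZE;   /* block 0 follows the header   */
  unsigned char cmd;
  int           count = 0;

  if (len < SMB_HDR_SIZE)
    return -1;
  cmd = msg[4];                       /* header COMMAND names block 0 */

  while (cmd != SMB_COM_NONE) {
    size_t next;

    /* WordCount plus the four-byte AndX prefix must be present. */
    if (count >= max || len - off < 5 || msg[off] < 2)
      return -1;
    cmds[count] = cmd;
    offs[count] = off;
    count++;

    cmd  = msg[off + 1];                                        /* AndXCommand */
    next = (size_t)msg[off + 3] | ((size_t)msg[off + 4] << 8);  /* AndXOffset  */
    if (cmd != SMB_COM_NONE) {
      if (next <= off || next >= len)   /* the chain must move forward */
        return -1;
      off = next;
    }
  }
  return count;
}
```

Given packet 11, a walker like this would report two blocks: the SESSION SETUP ANDX block at offset 32 and the TREE CONNECT ANDX block at the offset given by the first block's AndXOffset field (79).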


2.4.5 THE FLOW OF CONVERSATION

SMB conversations start after the session has been established via the transport
layer. As a rule, the client always speaks first. Clients send requests, servers
respond, and that's the way SMB is supposed to work. This is a hard-and-fast
rule which means, of course, that there is an exception. Fortunately, we can
(and will) put off talking about that exception until we talk about
Opportunistic Locks (OpLocks).

The NEGOTIATE PROTOCOL REQUEST/RESPONSE is always the first SMB exchange in the
conversation. The client and server need to know what language to speak before
they can say anything else. This is also a hard-and-fast rule, but there are no
exceptions (which is an exception to the rule that all hard-and-fast rules have
exceptions).

Once the dialect has been selected, the next formality is to establish an SMB
session using the SMB SESSION SETUP REQUEST message. We keep running into
terminology twists, and here we have yet another. The SMB SESSION SETUP exchange
sets up an SMB session within the NBT or naked TCP session.

Huh?

Well, yes, that's confusing. The problem is that we are talking about two
different kinds of sessions here.



 * There is the network session built at layer 5 of the OSI model, on top of the
   transport layer.



 * There is the user logon session.

Ah, there's a clue! The SESSION SETUP is used to perform authentication and
establish a user session with the server[17]. A quick look at the SESSION SETUP
ANDX REQUEST block in the packet above shows that the Exists utility did in fact
send a username--the name "GUEST", passed via the AccountName field--to the
server.

Once the user session is established, the client may try to connect to a share
using a TREE CONNECT SMB. It is a hard-and-fast rule that TREE CONNECT SMBs must
follow the SESSION SETUP. There is an exception to this as well, which we will
cover when we get to share-mode vs. user-mode authentication.

[Figure 2.6]

Figure 2.6 shows the right way to start an SMB conversation. Combining the
SESSION SETUP ANDX and TREE CONNECT ANDX SMBs into a single AndX message is
optional (jCIFS' Exists does, but Samba's smbclient doesn't). Once the
conversation has been initiated using the above sequence, the client is free to
improvise.


2.4.6 A LITTLE MORE CODE

There is another small detail you may have noticed while studying the captured
SMB packets--or perhaps you remember this from one of the Alert boxes in the
NBT section: SMBs are written using little-endian byte order. If your target
platform is big-endian, or if you want your code to be portable to big-endian
systems, you will need to be able to handle the conversion between host and SMB
byte order.

The htonl(), htons(), ntohl(), and ntohs() functions won't help us here. They
convert between host and network order. We need to be able to convert between
host and SMB order (and SMB order is definitely not the same as network order).

So, to solve the problem, we need a little bit of code, which is presented here
mostly to get it out of the way so that we won't have to bother with it when we
are dealing with more complex issues. The functions in Listing 2.2 read short
and long integer values directly from incoming message buffers and write them
directly to outgoing message buffers.

[Listing 2.2]
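As a rough sketch of the kind of helpers described, the functions below read and write 16-bit and 32-bit values in SMB (little-endian) order one byte at a time, which makes them independent of the host's own byte order. The function names are made up for this illustration and are not necessarily those used in the listing.

```c
#include <stdint.h>

/* Read a 16-bit value stored in SMB (little-endian) order. */
uint16_t smb_get_u16(const unsigned char *p)
{
  return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored in SMB (little-endian) order. */
uint32_t smb_get_u32(const unsigned char *p)
{
  return (uint32_t)p[0]
       | ((uint32_t)p[1] << 8)
       | ((uint32_t)p[2] << 16)
       | ((uint32_t)p[3] << 24);
}

/* Write a 16-bit value in SMB order. */
void smb_put_u16(unsigned char *p, uint16_t v)
{
  p[0] = (unsigned char)(v & 0xFF);
  p[1] = (unsigned char)(v >> 8);
}

/* Write a 32-bit value in SMB order. */
void smb_put_u32(unsigned char *p, uint32_t v)
{
  p[0] = (unsigned char)(v & 0xFF);
  p[1] = (unsigned char)((v >> 8) & 0xFF);
  p[2] = (unsigned char)((v >> 16) & 0xFF);
  p[3] = (unsigned char)(v >> 24);
}
```

Because the buffer is accessed through unsigned char pointers, these helpers also work on fields that are not aligned on a two-byte or four-byte boundary.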


2.4.7 TAKE A BREAK

Our field trip into SMB territory is now over. We have covered a lot of ground,
collected samples, and taken a look at SMBs in the wild. Our next step will be
doing the lab work, studying our specimens under a microscope. It is time to
take a break, relax, and reflect on what we have learned so far.

Time for a cup of tea.

In the next section we will go back over the SMB header in a lot more detail
with the goal of explaining some of the key concepts that we have only touched
on so far. You will probably want to be well rested and in a good mood for that.



--------------------------------------------------------------------------------


2.5 THE SMB HEADER IN DETAIL

During that first expedition into SMB territory we continually deferred studying
the finer details of the SMB header, among other things. We were trying to cover
the general concepts, but now we need to dig into the guts of SMB to see how
things really work. Latex gloves and lab coats required.

Let's start by revisiting the header layout. Just for review, here's what it
looks like:



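     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+---------------+---------------+
    |     0xff      |      'S'      |      'M'      |      'B'      |
    +---------------+---------------+---------------+---------------+
    |    COMMAND    |                   STATUS...                   |
    +---------------+---------------+-------------------------------+
    |   ...STATUS   |     FLAGS     |            FLAGS2             |
    +---------------+---------------+-------------------------------+
    |                             EXTRA                             |
    |                              ...                              |
    |                              ...                              |
    +-------------------------------+-------------------------------+
    |              TID              |              PID              |
    +-------------------------------+-------------------------------+
    |              UID              |              MID              |
    +-------------------------------+-------------------------------+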

The first four bytes are constant, so we won't worry about those. The COMMAND
field is fairly straightforward too; it's just a one-byte field containing an
SMB command code. The list of available codes is given in section 5.1 of the
SNIA doc. The rest of the header is where the fun lies....


2.5.1 THE SMB_HEADER.STATUS FIELD EXPOSED

Things get interesting starting at the STATUS field. It wouldn't be so bad
except for the fact that there are two possible error code formats to consider.
There is the DOS & OS/2 format, and then there is the NT_STATUS format. In
C-language terms, the STATUS field looks something like this:

    typedef union
      {
      ulong NT_Status;
      struct
        {
        uchar  ErrorClass;
        uchar  reserved;
        ushort ErrorCode;
        } DosError;
      } Status;

From the client side, one way to deal with the split personality problem is to
use the DOS codes exclusively[18]. These are fairly well documented (by SMB
standards), and should be supported by all SMB servers. Using DOS codes is
probably a good choice, but there is a catch... There are some advanced features
which simply don't work unless the client negotiates NT_STATUS codes.
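The union above is handy as a mental model, but overlaying it directly on a receive buffer depends on the host's byte order and alignment (the STATUS field starts at byte 5 of the header). A minimal sketch of a more portable decode reads the four bytes individually. The struct and function names are invented for this example, and the caller is assumed to know already which format was negotiated.

```c
#include <stdint.h>

struct smb_status {
  int      is_nt;        /* nonzero: nt_status is valid; zero: DOS fields are */
  uint32_t nt_status;    /* 32-bit NT_STATUS value                            */
  uint8_t  error_class;  /* DOS ErrorClass                                    */
  uint16_t error_code;   /* DOS ErrorCode                                     */
};

/* Decode the four STATUS bytes (header offsets 5 through 8).
 * use_nt says which of the two formats is in effect.
 */
void smb_get_status(const unsigned char *hdr, int use_nt, struct smb_status *st)
{
  const unsigned char *s = hdr + 5;

  st->is_nt       = use_nt;
  st->nt_status   = 0;
  st->error_class = 0;
  st->error_code  = 0;

  if (use_nt) {
    st->nt_status = (uint32_t)s[0]
                  | ((uint32_t)s[1] << 8)
                  | ((uint32_t)s[2] << 16)
                  | ((uint32_t)s[3] << 24);
  } else {
    st->error_class = s[0];                              /* s[1] is reserved */
    st->error_code  = (uint16_t)(s[2] | (s[3] << 8));
  }
}

/* In both formats, all zeros means success. */
int smb_status_ok(const struct smb_status *st)
{
  return st->is_nt ? (st->nt_status == 0)
                   : (st->error_class == 0 && st->error_code == 0);
}
```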
 


 
Rats!
-- Charlie Brown
Peanuts, by Charles Schulz



Strange Behavior Alert:


--------------------------------------------------------------------------------

If the client negotiates Extended Security with a Windows2000 server and also
negotiates DOS error codes, then the SESSION SETUP ANDX will fail, and return a
DOS hardware error. (!?)

    STATUS
      {
      ErrorClass = 0x03   (Hardware Error)
      ErrorCode  = 0x001F (General Error)
      }

Perhaps W2K doesn't know which DOS error to return, and is guessing. The bigger
question is: why does this fail at all?

The same SMB conversation with the NT_STATUS capability enabled works just fine.
Perhaps, when the coders were coding that piece of code, they assumed that only
clients capable of using NT_STATUS codes would also use the Extended Security
feature. Perhaps that assumption came from the knowledge that all Windows
systems that could handle Extended Security would negotiate NT_STATUS. We can
only guess...

This is one of the oddities of SMB, and another fine bit of forensic SMB
research by Andrew Bartlett of the Samba Team.
 



Another reason to support NT_STATUS codes is that they provide finer-grained
diagnostics, simply because there are more of them defined than there are DOS
codes. Samba has a fairly complete list of the known NT_STATUS codes, which can
be found in the samba/source/include/nterr.h file in the Samba distribution. The
list of DOS codes is in doserr.h in the same directory.

We have already described the structure of the DOS error codes. NT_STATUS codes
also have a structure, and it looks like this:



    Bits    0-1      2-3           4-15        16-31
    Field   Level    <reserved>    Facility    ErrorCode

(Bit 0 is the highest-order bit of the 32-bit value.)

In testing, it appears as though the Facility field is always set to zero
(FACILITY_NULL) for SMB errors. That leaves us with the Level and ErrorCode
fields to provide variety ... and, as we have suggested, there is quite a bit of
variety. Samba's nterr.h file lists over 500 NT_STATUS codes, while doserr.h
lists only 99 (and some of those are repeats).

Level is one of the following:



> 00 == Success
> 01 == Information
> 10 == Warning
> 11 == Error

Since the next two bits (the <reserved> bits) are always zero, the highest-order
nibble will have one of the following values: 0x0, 0x4, 0x8, or 0xC. At the
other end of the longword, the ErrorCode is read as an unsigned short (just like
the DOS ErrorCode field).
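A short sketch may help here. The functions below pull the three interesting fields out of a 32-bit NT_STATUS value using the layout just described (Level in the top two bits, twelve bits of Facility, sixteen bits of ErrorCode). The function names are invented for this example.

```c
#include <stdint.h>

/* Field extraction for a 32-bit NT_STATUS value. */
static unsigned nt_level(uint32_t st)     { return (st >> 30) & 0x3; }
static unsigned nt_facility(uint32_t st)  { return (st >> 16) & 0x0FFF; }
static unsigned nt_errorcode(uint32_t st) { return st & 0xFFFF; }

/* Map the two Level bits to the names used in the list above. */
static const char *nt_level_name(uint32_t st)
{
  static const char *names[] = { "Success", "Information", "Warning", "Error" };
  return names[nt_level(st)];
}
```

For 0xC000001D, the Level is 3 (Error), the Facility is zero, and the ErrorCode is 0x001D.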

The availability of Samba's list of NT_STATUS codes makes things easy. It took a
bit of doing to generate that list, however, as most of the codes are not
documented in an accessible form. Andrew Tridgell described the method below,
which he used to generate a list of valid NT_STATUS codes. His results were used
to create the nterr.h file used in Samba.




Tridge's Trick:

--------------------------------------------------------------------------------

 1. Modify the source of Samba's smbd daemon so that whenever you try to delete
    a file that matches a specific pattern it will return an NT_STATUS error
    code. (Do this on a testing copy, of course. This hack is not meant for
    production.) For example, return an error whenever the filename to be
    deleted matches "STATUS_CODE_HACK_FILENAME.*". Another thing to do is to
    include the specific error number as the filename extension, so that the
    name
    
    
    
    > STATUS_CODE_HACK_FILENAME.0xC000001D
    
    will cause Samba to return an NT_STATUS code of 0xC000001D.



 2. Create the files on the server side first so you have something to delete.
    That is easily done with a shell script, such as this:
    
        #!/bin/bash
        #
        i=0;j=256
        while [ $i -lt $j ]
        do
          touch `printf "STATUS_CODE_HACK_FILENAME.0xC000%.4x" $i`
          i=`expr $i + 1`
        done
    
    Change the values of i and j to generate different ranges.



 3. On a WindowsNT or Windows2000 system, mount the Samba share containing the
    generated STATUS_CODE_HACK* files. Next, open a DOS command shell and, one
    by one, delete the files. For each file, Samba should return the specified
    NT_STATUS code...and Windows will interpret the code and tell you what it
    means. If the code is not defined, Windows will tell you that as well.



 4. If you capture the delete transactions using Microsoft's NetMon tool, it
    will show you the symbolic names that Microsoft uses for the NT_STATUS
    codes.
     



Okay, now for the next conundrum...

Servers have it tougher than clients. Consider a server that needs to respond to
one client using DOS error codes, and to another client using NT_STATUS codes.
That's bad enough, but consider what happens when that server needs to query yet
another server in order to complete some operation. For example, a file server
might need to contact a Domain Controller in order to authenticate the user.

The problem is that, no matter which STATUS format the Domain Controller uses
when responding to the file server, it will be the wrong format for one of the
clients. To solve this problem the server needs to provide a consistent mapping
between DOS and NT_STATUS codes.
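In code, such a mapping usually boils down to a lookup table. The sketch below is only that: a sketch. The two rows shown are believed to agree with Samba's errormap.c, but treat them as placeholders and check the real file. The fallback to ErrorClass 0x03 / ErrorCode 0x001F for unknown codes is an assumption made here, chosen to mirror the W2K behavior described earlier. All of the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint32_t nt_status;
  uint8_t  dos_class;
  uint16_t dos_code;
} status_map_entry;

/* Two illustrative rows only.  A usable table needs hundreds of entries,
 * discovered by testing against real servers. */
static const status_map_entry status_map[] = {
  { 0xC0000022u, 0x01, 0x0005 },   /* access denied -> DOS class 1, code 5 */
  { 0xC000000Fu, 0x01, 0x0002 },   /* no such file  -> DOS class 1, code 2 */
};

/* Translate an NT_STATUS value into a DOS ErrorClass/ErrorCode pair.
 * Unknown codes fall back to the generic hardware error (assumption). */
static void nt_to_dos(uint32_t nt, uint8_t *cls, uint16_t *code)
{
  size_t i;

  if (nt == 0) {                   /* success is zero in both formats */
    *cls = 0;
    *code = 0;
    return;
  }
  for (i = 0; i < sizeof status_map / sizeof status_map[0]; i++) {
    if (status_map[i].nt_status == nt) {
      *cls  = status_map[i].dos_class;
      *code = status_map[i].dos_code;
      return;
    }
  }
  *cls  = 0x03;
  *code = 0x001F;
}
```

A linear search is fine for a sketch. A table with several hundred rows would more likely be sorted and searched with bsearch().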

WindowsNT and Windows2000 both have such mappings built-in but, of course, the
details are not published (a partial list is given in section 6 of the SNIA
doc). Andrew Bartlett used a trick similar to Tridge's in order to generate the
required mappings. His setup uses a Samba server running as a Primary Domain
Controller (PDC), and a Windows2000 system providing SMB file services. A third
system, running Samba's smbtorture testing utility, acts as the client. When the
client system tries to log on to the Windows server, Windows passes the login
request to the Samba PDC.

The test works like this:




Andrew Bartlett's Trick:

--------------------------------------------------------------------------------



 1. Modify Samba's authentication code to reject login attempts for any username
    beginning with "0x". Translate the login name (e.g., "0xC000001D") into an
    NT_STATUS code, and return that in the STATUS field.



 2. Configure smbtorture to negotiate DOS error codes. Aim smbtorture at the W2K
    SMB server and try logging in as user 0xC0000001, 0xC0000002... etc.



 3. For each login attempt from the client, the Windows SMB server will receive
    a login failure message from the Samba PDC. Since smbtorture has requested
    DOS error codes, the W2K pickle-in-the-middle is forced to translate the
    NT_STATUS values into DOS error codes...and that's how you can discover
    Microsoft's mapping of NT_STATUS codes to DOS error codes.

The test configuration is shown in figure 2.7.
 



[Figure 2.7]

Andrew's test must be rerun periodically. The mappings have been known to change
when Windows service packs are installed. See the file
samba/source/libsmb/errormap.c in the Samba distribution for more fun and
adventure[19].


2.5.2 THE FLAGS AND FLAGS2 FIELDS TELL ALL

Most (but not all) of the bits in the older FLAGS field are of interest only to
older servers. They represent features that have been superseded by newer
features in newer servers. It would be nice if all of the old stuff would just
go away so that we wouldn't have to worry about it. It does seem, in fact, as
though this is slowly happening. (Maybe it would be better if the old stuff
stayed and the new stuff had never happened. Hmmm...)
 

Duh... dat sounds logical!
-- Baby Huey
Harvey Entertainment   

In any case, this next table presents the FLAGS bits in order of descending
significance--the opposite of the order used in the SNIA doc. English speaking
people tend to read from left to right and from top to bottom, so it seems
logical (as this book is, more or less, written in English[20]) to transpose
left-to-right order into a top-to-bottom table.



SMB_HEADER.FLAGS

Bit # | Name / Bitmask / Values | Description

7  SMB_FLAGS_SERVER_TO_REDIR
0x80

0: request
1: reply

What an awful name!
On DOS, OS/2, and Windows systems, the client is built into the operating system
and is called a "redirector", which is where the "SERVER_TO_REDIR" part of the
name comes from. Basically, though, this is simply the reply flag.

6  SMB_FLAGS_REQUEST_BATCH_OPLOCK
0x40

0: Exclusive
1: Batch

Obsolete.
If bit 5 is set, then bit 6 is the "batch OpLock" (aka. OPBATCH) bit. Bit 6
should be clear if bit 5 is clear.

In a request from the client, this bit is used to indicate whether the client
wants an exclusive OpLock (0) or a batch OpLock (1). In a response, this bit
indicates that the server has granted the batch OpLock.

OpLocks (opportunistic locks) will be covered later.

This bit is only used in the deprecated SMB_COM_OPEN, SMB_COM_CREATE, and
SMB_COM_CREATE_NEW SMBs. It should be zero in all other SMBs. The
SMB_COM_OPEN_ANDX SMB has a separate set of flags that handle OpLock requests,
as does the SMB_COM_NT_CREATE_ANDX SMB.

5  SMB_FLAGS_REQUEST_OPLOCK
0x20

0: no OpLock
1: OpLock

Obsolete.
This is the "OpLock" bit. If this bit is set in a request, it indicates that the
client wants to obtain an OpLock. If set in the reply, it indicates that the
server has granted the OpLock.

OpLocks (opportunistic locks) will be covered later.

This bit is only used in the deprecated SMB_COM_OPEN, SMB_COM_CREATE, and
SMB_COM_CREATE_NEW SMBs. It should be zero in all other SMBs. The
SMB_COM_OPEN_ANDX SMB has a separate set of flags that handle OpLock requests,
as does the SMB_COM_NT_CREATE_ANDX SMB. (Sigh.)

4  SMB_FLAGS_CANONICAL_PATHNAMES
0x10

0: Host format
1: Canonical

Obsolete.
This was supposed to be used to indicate whether or not pathnames in SMB
messages were mapped to their "canonical" form. Thing is, it doesn't do much
good to write a client or server that doesn't map names to the canonical form
(which is basically DOS, OS/2, or Windows compatible). This bit should always be
set (1).

3  SMB_FLAGS_CASELESS_PATHNAMES
0x08

0: case-sensitive
1: caseless

When this bit is clear (0), pathnames should be treated as case-sensitive. When
the bit is set, pathnames are considered caseless.

All good in theory. The trouble is that some systems assume caseless pathnames
no matter what the state of this bit. Best practice on the client side is to
leave this bit set (1) and always assume caseless pathnames.

2  0x04 <Reserved> (must be zero)
...well, sort of. This bit is clearly listed as "Reserved (must be zero)" in
both the SNIA and the X/Open docs, yet the latter contains some odd references
to optionally using this bit in conjunction with OpLocks. It's probably a typo.
Best bet is to clear it (0) and leave it alone.

1  SMB_FLAGS_CLIENT_BUF_AVAIL
0x02

0: Not posted
1: Buffer posted

Obsolete.
This was probably useful with other transports, such as NetBEUI. If the client
sets this bit, it is telling the server that it has already posted a buffer to
receive the server's response. The expired Leach/Naik Internet Draft says that
this allows a "send without acknowledgment" from the server.

This bit should be clear (0) for use with NBT and naked TCP transports.

0  SMB_FLAGS_SUPPORT_LOCKREAD
0x01

0: Not supported
1: Supported

Obsolete.
If this bit is set in the SMB NEGOTIATE PROTOCOL RESPONSE, then the server
supports the deprecated SMB_COM_LOCK_AND_READ and SMB_COM_WRITE_AND_UNLOCK SMBs.
Unless you are implementing outdated dialects, this bit should be clear (0).

The NEGOTIATE PROTOCOL REQUEST that we dissected back in section 2.4.3 shows
only the SMB_FLAGS_CANONICAL_PATHNAMES and SMB_FLAGS_CASELESS_PATHNAMES bits
set, which is probably the best thing for new implementations to do. Testing
with other clients may reveal other workable combinations.
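For reference, the FLAGS table translates into C along these lines. The bitmask names and values are taken from the table. The two helper functions are hypothetical, and the "new request" default simply follows the suggestion above of setting only the canonical and caseless pathname bits.

```c
#include <stdint.h>

/* SMB_HEADER.FLAGS bitmasks, from the table above. */
#define SMB_FLAGS_SERVER_TO_REDIR       0x80
#define SMB_FLAGS_REQUEST_BATCH_OPLOCK  0x40
#define SMB_FLAGS_REQUEST_OPLOCK        0x20
#define SMB_FLAGS_CANONICAL_PATHNAMES   0x10
#define SMB_FLAGS_CASELESS_PATHNAMES    0x08
#define SMB_FLAGS_CLIENT_BUF_AVAIL      0x02
#define SMB_FLAGS_SUPPORT_LOCKREAD      0x01

/* Suggested FLAGS value for requests from a new client implementation. */
static uint8_t smb_flags_new_request(void)
{
  return SMB_FLAGS_CANONICAL_PATHNAMES | SMB_FLAGS_CASELESS_PATHNAMES;
}

/* Nonzero if the message is a reply (server to redirector). */
static int smb_flags_is_reply(uint8_t flags)
{
  return (flags & SMB_FLAGS_SERVER_TO_REDIR) != 0;
}
```

With these definitions, smb_flags_new_request() works out to 0x18.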

Now let's take a look at the newer flags in the FLAGS2 field.



SMB_HEADER.FLAGS2

Bit # | Name / Bitmask / Values | Description

15  SMB_FLAGS2_UNICODE_STRINGS
0x8000

0: ASCII
1: Unicode

If set (1), this bit indicates that string fields within the SMB message are
encoded using a two-byte, little-endian Unicode format. The SNIA doc says that
the format is UTF-16LE but some folks on the Samba Team say it's really UCS-2LE.
The latter is probably correct, but it may not matter as both formats are
identical for characters in the Basic Multilingual Plane. Doesn't Unicode sound
like fun[21]?

If clear (0), all strings are in 8-bit ASCII format (by which we actually mean
8-bit OEM character set format).

14  SMB_FLAGS2_32BIT_STATUS
0x4000

0: DOS error code
1: NT_STATUS code

Indicates whether the STATUS field is in DOS or NT_STATUS format. This may also
be used to help the server guess which format the client prefers before it has
actually been negotiated.

13  SMB_FLAGS2_READ_IF_EXECUTE
0x2000

0: Execute != Read
1: Execute confers Read

A quirky little bit this. If set (1), it indicates that execute permission on a
file also grants read permission. It is only useful in read operations.

12  SMB_FLAGS2_DFS_PATHNAME
0x1000

0: Normal pathname
1: DFS pathname

This is used with the Distributed File System (DFS), which we haven't covered
yet. If this bit is set (1), it indicates that the client knows about DFS, and
that the server should resolve any UNC names in the SMB message by looking in
the DFS namespace. If this bit is clear (0), the server should not check the DFS
namespace.

11  SMB_FLAGS2_EXTENDED_SECURITY
0x0800

0: Normal security
1: Extended security

If set (1), this bit indicates that the sending node understands Extended
Security. We'll touch on this again when we discuss authentication.

10  0x0400 <Reserved> (must be zero)

9  0x0200 <Reserved> (must be zero)

8  0x0100 <Reserved> (must be zero)

7  0x0080 <Reserved> (must be zero)

6  SMB_FLAGS2_IS_LONG_NAME
0x0040

0: 8.3 format
1: Long names

If set (1), then any pathnames that the SMB contains are long pathnames, else
the pathnames are in 8.3 format. Any new CIFS implementation really should
support long names.

5  0x0020 <Reserved> (must be zero)

4  0x0010 <Reserved> (must be zero)

3  0x0008 <Reserved> (must be zero)

2  SMB_FLAGS2_SECURITY_SIGNATURE
0x0004

0: No signature
1: Message Authentication Codes

If set, the SMB contains a Message Authentication Code (MAC). The MAC is used to
authenticate each packet in a session, to prevent various attacks.

1  SMB_FLAGS2_EAS
0x0002

0: No EAs
1: Extended Attributes

Indicates that the client understands Extended Attributes.

Note that the SNIA doc talks about "Extended Attributes" and about "Extended
File Attributes". These are two completely different concepts. Extended
Attributes are a feature of OS/2. They are mentioned in section 1.1.6 (page 2)
of the SNIA doc and explained in better detail on page 87. Extended File
Attributes are described in section 3.13 (page 30) of the SNIA doc.

The SMB_FLAGS2_EAS bit deals with Extended Attribute support.

0  SMB_FLAGS2_KNOWS_LONG_NAMES
0x0001

0: Client wants 8.3
1: Long pathnames okay

Set by the client to let the server know that long names are acceptable in the
response.

Some of the flags are used to modify the interpretation of the SMB message,
while others are used to negotiate features. Some do both. It may take some
experimentation to find the safest way to handle these bits. Implementations are
not consistent, so new code must be fine-tuned.

You may need to refer back to these tables as we dig further into the details.
Note that the constant names listed above may not match those in the SNIA doc,
or those in other docs or available source code. There doesn't seem to be a lot
of agreement on the names.
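Here is the FLAGS2 table rendered as C constants, along with two small helpers. The names and values come from the table. The reserved-bit mask is just the union of the seven bits marked "<Reserved> (must be zero)", and the helper functions are hypothetical names invented for this sketch.

```c
#include <stdint.h>

/* SMB_HEADER.FLAGS2 bitmasks, from the table above. */
#define SMB_FLAGS2_UNICODE_STRINGS     0x8000
#define SMB_FLAGS2_32BIT_STATUS        0x4000
#define SMB_FLAGS2_READ_IF_EXECUTE     0x2000
#define SMB_FLAGS2_DFS_PATHNAME        0x1000
#define SMB_FLAGS2_EXTENDED_SECURITY   0x0800
#define SMB_FLAGS2_IS_LONG_NAME        0x0040
#define SMB_FLAGS2_SECURITY_SIGNATURE  0x0004
#define SMB_FLAGS2_EAS                 0x0002
#define SMB_FLAGS2_KNOWS_LONG_NAMES    0x0001

/* Bits 10, 9, 8, 7, 5, 4, and 3 are reserved. */
#define SMB_FLAGS2_RESERVED_MASK       0x07B8

/* Nonzero if none of the reserved bits are set. */
static int smb_flags2_reserved_ok(uint16_t flags2)
{
  return (flags2 & SMB_FLAGS2_RESERVED_MASK) == 0;
}

/* Nonzero if the STATUS field should be read as an NT_STATUS code. */
static int smb_flags2_uses_nt_status(uint16_t flags2)
{
  return (flags2 & SMB_FLAGS2_32BIT_STATUS) != 0;
}
```

Whether a receiver should reject a message with reserved bits set, or quietly ignore them, is one of those things that has to be settled by testing. The check is provided as a diagnostic, not as a rule.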


2.5.3 EXTRA! EXTRA! READ ALL ABOUT IT!

Um, actually we are going to delay covering the EXTRA field yet again.
EXTRA.PidHigh will be thrown in with the PID field, and EXTRA.Signature will be
handled as part of authentication.


2.5.4 TID AND UID: SEPARATED AT BIRTH?

It would seem logical that the [V]UID and TID fields would be somehow related.
Both are assigned and managed by the server, and we said before that the SESSION
SETUP (where the logon occurs) is supposed to happen before the TREE CONNECT.

Well, put all that aside and pay attention to this little story...




Storytime

--------------------------------------------------------------------------------

Once upon a time there were many, many magic kingdoms taking up office space in
cities and towns around the world. In each of these magic kingdoms there were
lots of overpaid advisors called VeePees. The VeePees were all jealous of one
another, but they were more jealous of the underpaid wizards in the IT
department who had power over the data and could work spells and make the
numbers come out all right.

Then, one day, evil marketing magicians appeared and convinced the VeePees that
they could steal all of the power away from the wizards of IT and have it for
themselves. To do this, the only thing the VeePees would need was a magic box
called a PeeCee (the name appealed to the VeePees). PeeCees, of course, were not
cheap but the lure of power was great and the marketing magicians knew that the
VeePees had control of the budget.

Soon, the wizards of IT discovered that their supplies of mag-tapes and 8-inch
floppies were dwindling, and that no one had bothered to update the service
contracts on their VAXes. Worse, the VeePees started taunting them, saying "We
don't need you any more. We have spreadsheets". The wise wizards of IT smiled
quietly, went back to their darkened cubicles, and entertained themselves by
implementing EMACS in TECO macro language. They did not seem at all surprised
when the VeePees showed up asking questions like "What happens if I format
C-colon?" and "Should I Abort, Retry, or--um--Fail?". The wizards understood
what the VeePees did not: With power there must be equal measures of knowledge
and understanding, otherwise the power will consume the data--and the user.

The marketing magicians, seeing that their golden goose was molting, came up
with a bold plan. They conjured up a LAN system and connected it to a shiny new
fileserver, which they gave to the IT wizards. At first, the wizards were
delighted by the wonderful new server and the beautiful strands of network cable
running all over the kingdom. They quickly realized, however, that they had been
tricked. The client/server architecture had effectively separated authority from
responsibility, and the wizards were left with only the latter.

...and so it is unto this very day. The VeePees and their minions have their
PeeCees and hold the power of the data, but they remain under the influence of
the sinister marketing magicians. The wizards of IT are still underpaid, have
little or no say when decisions are made, and are held responsible and told to
clean up the mess whenever anything goes wrong. A wholly dysfunctional
arrangement.
 



So what the purplebananafish does this have to do with TIDs and UIDs? Well, see,
it's like this...

Early corporate LANs, such as those in our story, were small and self-contained.
The driving goal was to make sure that the data were available to everyone in
the office who could legitimately claim to need access. Security was not
considered a top priority, so PC OSes (e.g., DOS) did not support complicated
minicomputer features like user-based authentication. Given the environment, it
is not surprising that the authentication system originally built into SMB was
(by today's standards) quite primitive. Passwords, if they were used at all,
were assigned to shares--not users--and everyone who wanted to access the same
share would use the same password.

This early form of SMB authentication is now known as "Share Level" security. It
does not include the concept of user accounts, so the UID field is always zero.
The password is included in the TREE CONNECT message, and a valid TID indicates
a successfully authenticated connection. In fact, though the UID field is listed
in the SMB message format layout described in the ancient COREP.TXT scrolls, it
is not mentioned again anywhere else in that document. There is no mention of a
SESSION SETUP message either.

There are some interesting tricks that add a bit of flexibility to Share Level
security. For example, a single share may have multiple passwords assigned, each
granting different access rights. It is fairly common, for instance, to assign
separate passwords for read-only vs. read/write access to a share.

Another interesting fudge is often used to provide access to user home
directories. The server (which, in this case, understands user-based
authentication even if the protocol and/or client do not) simply offers
usernames as share names. When a user connects to the share matching their
username, they give their own login password. The server then checks the
username/password pair using its normal account validation routines. Thus,
user-based authentication can be mapped to Share Level security. (See figure
2.8.)

[Figure 2.8]

Share Level security, though still used, is considered deprecated. It has been
replaced with "User Level" security which, of course, makes use of
username/password instead of sharename/password pairs.
 

What a difference
15 years can make
-- John Lewkowicz,
The Complete MUMPS   

Under User Level security, the SESSION SETUP is performed as the authentication
step before any TREE CONNECT requests may be sent. If the logon succeeds, the
server will assign a valid (non-zero) UID. Subsequent TREE CONNECT attempts can
use the UID as an authentication token when requesting access to a share. If
User Level security is in use, the password field in the TREE CONNECT message
will be blank.

So, with User Level security, the client must authenticate to get a valid UID,
and then present the UID to gain access to shares. Thing is, more than one UID
may be generated within a single connection, and the UID used to connect to the
share does not need to be the same as the one used to access files within the
share.


2.5.5 PID AND MID REVEALED

Simply put:



 * a PID identifies a client process,
 * a [PID, MID] pair identifies a thread within a process.

That's the idea, anyway. The client provides values for these fields when it
sends a request to the server, and the server is supposed to echo the values
back in the response. That way, the client can match the reply to the original
request.

Some systems (such as Windows and OS/2) multiplex all of the SMB traffic between
a client and a server over a single TCP connection. If the client OS is
multi-tasking there may be several active SMB sessions running concurrently, so
there may be several requests outstanding at any given time. The SMB
conversations are all intertwined, so the client needs a way to sort out the
replies and hand them off to the correct thread within the correct process. (See
figure 2.9.)

[Figure 2.9]

The PID field is also used to maintain the semantics of local file I/O. Think
about a simple program, like the one in listing 2.3 which opens a file in
read-only mode and dumps the contents. Consider, in particular, the call to the
open() function, which returns a file descriptor. File descriptors are
maintained on a per-process basis--that is, each process has its own private
set. The descriptor is an integer used by the operating system to identify an
internal record that keeps track of lots of information about the open file,
such as:



 * Is the file open for read, write, or both?
 * What is the current file pointer offset within the file?
 * Do we have any locks on the file?

[Listing 2.3]

Now take all of that and stretch it out across a network. The files physically
reside on the server and information about locks, offsets, etc. must be kept on
the server side. The process that has opened the files, however, resides on the
client and all of the file status information is relevant within the context of
that process. That brings us back to what we said before: The PID identifies a
client process. It lets the server keep track of client context, and associate
it correctly with the right customer when the requests come rolling in.

Further complicating things, some clients support multiple threads running
within a process. Threads share context (memory, file descriptors, etc.) with
their sister threads within the same process, but each thread may generate SMB
traffic all on its own. The MID field is used to make sure that server replies
get back to the thread that sent the request. The server really doesn't do much
with the MID. It just echoes it back to the client so, in fact, the client could
make whatever use it wanted of the MID field. Using it as a thread identifier is
probably the most practical thing to do.

There is an important rule which the client should obey with regard to the MID
and PID fields: Only one SMB request should ever be outstanding per [PID, MID]
pair per connection. The reason for this rule is that the client will generally
need to know the result of a request before sending the next request, especially
if an error occurred. The problems which might result should this rule be broken
probably depend upon the server, but defensive programming practices would
suggest avoiding trouble.
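One way a client might enforce that rule is to keep a small per-connection table of outstanding requests, keyed by the [PID, MID] pair. The sketch below is one possible arrangement, not something taken from an existing implementation. The type names, function names, and the table size are all made up for illustration.

```c
#include <stdint.h>

#define MAX_PENDING 32   /* arbitrary limit for this sketch */

typedef struct {
  int      in_use;
  uint16_t pid;
  uint16_t mid;
} pending_req;

/* One of these per connection. */
typedef struct {
  pending_req slot[MAX_PENDING];
} pending_table;

/* Record a request that is about to be sent.  Returns 0 on success, or -1
 * if [pid, mid] already has a request outstanding or the table is full. */
static int pending_add(pending_table *t, uint16_t pid, uint16_t mid)
{
  int i, free_slot = -1;

  for (i = 0; i < MAX_PENDING; i++) {
    if (t->slot[i].in_use) {
      if (t->slot[i].pid == pid && t->slot[i].mid == mid)
        return -1;                        /* rule would be broken */
    } else if (free_slot < 0) {
      free_slot = i;
    }
  }
  if (free_slot < 0)
    return -1;
  t->slot[free_slot].in_use = 1;
  t->slot[free_slot].pid = pid;
  t->slot[free_slot].mid = mid;
  return 0;
}

/* Match a reply to its request.  Returns 0 and frees the slot if the
 * [pid, mid] pair was outstanding, or -1 if the reply is unexpected. */
static int pending_complete(pending_table *t, uint16_t pid, uint16_t mid)
{
  int i;

  for (i = 0; i < MAX_PENDING; i++) {
    if (t->slot[i].in_use && t->slot[i].pid == pid && t->slot[i].mid == mid) {
      t->slot[i].in_use = 0;
      return 0;
    }
  }
  return -1;
}
```

The same table doubles as the demultiplexer. When a reply arrives, the echoed PID and MID values identify which waiting thread should be handed the result.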

2.5.5.1 EXTRA.PIDHIGH DARK SECRETS UNCOVERED

Earlier on we promised to cover the EXTRA.PidHigh field. Well, a promise is a
promise...

The PidHigh field is supposed to be a PID extension, allowing the use of 32-bit
rather than 16-bit values as process identifiers. As with all extensions,
however, there is the basic problem of backward compatibility.

In this case, trouble shows up if (and only if) the client supports 32-bit
process IDs but the server does not. In that situation, the client must have a
mechanism for mapping 32-bit process IDs to 16-bit values that can fit into the
PID field. It doesn't need to be an elaborate mapping scheme, and it is unlikely
that there will be 64K client processes talking to the same server at the same
time, so it should be a simple problem to solve.

Since that mapping mechanism needs to be in place in order for the client to
work with servers that don't support the PidHigh field, there's no reason to use
32-bit process IDs at all. In testing, it appears as though the PidHigh field
is, in fact, always zero (except in some obscure security negotiations that are
still not completely understood). Best bet, leave it zero.
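The mapping mechanism mentioned above really can be simple. As a rough sketch, a client could hand out 16-bit "wire" PIDs from a small table indexed by slot number. Everything here is an assumption made for illustration: the names, the table size, and the choice to reserve zero as a "no mapping" value.

```c
#include <stdint.h>

#define PID_MAP_SIZE 64   /* arbitrary; far fewer than 64K processes */

typedef struct {
  int      used[PID_MAP_SIZE];
  uint32_t local_pid[PID_MAP_SIZE];
} pid_map;

/* Return the 16-bit wire PID for a 32-bit local PID, allocating a slot
 * the first time the local PID is seen.  Wire PIDs are slot index + 1.
 * Returns 0 if the table is full. */
static uint16_t pid_to_wire(pid_map *m, uint32_t local)
{
  int i, free_slot = -1;

  for (i = 0; i < PID_MAP_SIZE; i++) {
    if (m->used[i]) {
      if (m->local_pid[i] == local)
        return (uint16_t)(i + 1);
    } else if (free_slot < 0) {
      free_slot = i;
    }
  }
  if (free_slot < 0)
    return 0;
  m->used[free_slot] = 1;
  m->local_pid[free_slot] = local;
  return (uint16_t)(free_slot + 1);
}

/* Reverse lookup for replies.  Returns 0 if the wire PID is unknown. */
static uint32_t wire_to_pid(const pid_map *m, uint16_t wire)
{
  if (wire == 0 || wire > PID_MAP_SIZE || !m->used[wire - 1])
    return 0;
  return m->local_pid[wire - 1];
}
```

A real client would also release a slot when the local process closes its last file on the connection. That bookkeeping is left out to keep the sketch short.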


2.5.6 SMB HEADER FINAL REPORT

Code...

The next two listings (2.4a and 2.4b) provide support for reading and writing
SMB message headers. Most of the header fields are simple integer values, so we
can use the smb_Set*() and smb_Get*() functions from listing 2.2 to move the
data in and out of the header buffer. To make subsequent code easier to read, we
provide a set of macros with nice clear names to front-end the function calls
and assignments that are actually used.

[Listing 2.4a]   [Listing 2.4b]

The smb_hdrInit() and smb_hdrCheck() functions are there primarily to ensure
that the SMB headers are reasonably sane. They check for things like the buffer
size, and ensure that the "\xffSMB" string is included correctly in the header
buffer.

Note that none of these functions or macros handle the reading and writing of
the four-byte session header, though that would be trivial. The SESSION MESSAGE
header is part of the transport layer, not SMB. It is handled as a simple
network-byte-order longword; something from the NBT Session Service that has
been carried over into naked transport. (We covered all this back in sections
1.6 and 2.1.2.)
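Just to show how trivial it is, here is a sketch of the session header handling. It assumes a SESSION MESSAGE (leading type byte of 0x00) and a length that fits within the 17 bits available under NBT, so the whole header can be written as a big-endian longword with the length in the low-order bytes. The function names are invented for this example and do not appear in listings 2.4a or 2.4b.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SESS_MAX_LEN 0x1FFFFu   /* 17-bit length limit, assumed for NBT */

/* Write the four-byte session header for an SMB message of len bytes. */
static void sess_put_header(uint8_t hdr[4], uint32_t len)
{
  assert(hdr != NULL && len <= SESS_MAX_LEN);
  hdr[0] = 0x00;                            /* SESSION MESSAGE type */
  hdr[1] = (uint8_t)((len >> 16) & 0xFF);
  hdr[2] = (uint8_t)((len >> 8) & 0xFF);
  hdr[3] = (uint8_t)(len & 0xFF);
}

/* Read the message length back out of a session header. */
static uint32_t sess_get_length(const uint8_t hdr[4])
{
  assert(hdr != NULL);
  return ((uint32_t)hdr[1] << 16) | ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
}
```

The bytes are written one at a time rather than through a cast, so the code behaves the same on big-endian and little-endian hosts.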



--------------------------------------------------------------------------------


2.6 PROTOCOL NEGOTIATION



This one goes to eleven.
-- Nigel Tufnel
(Christopher Guest),
This Is Spinal Tap   

CIFS is a very rich and varied protocol suite, a fact that is evident in the
number of SMB dialects that exist. Five are listed in the X/Open SMB protocol
specification, and the SNIA doc--published ten years later--lists eleven. That's
a big bunch, and they probably missed a few. Each new dialect may add new SMBs,
deprecate old ones, or extend existing ones. As if that were not enough,
implementations introduce subtle variations within dialects.

All that in mind, our goal in this section will be to provide an overview of the
available dialects, cover the workings of the NEGOTIATE PROTOCOL SMB exchange,
and take a preliminary peek at some of the concepts that we have yet to consider
(things like virtual circuits and authentication). For the most part, the
examples and discussion will be based on the "NT LM 0.12" dialect. The majority
of the servers currently available support some variation of NT LM 0.12, and at
least one client implementation (jCIFS) has managed to get by without supporting
any others. Server writers should be warned, however, that there really are a
lot of clients still around that use older calls. Even new clients will use
older calls, simply because of the difficulty of acquiring reliable
documentation on the newer stuff.


2.6.1 A SMATTERING OF SMB DIALECTS

In keeping with tradition, the list of dialects is presented as a table with the
dialect name in the left-hand column and a short description in the right,
ordered from oldest to newest. Most of the references to these dialects seem to
do it this way. Our list is not quite as complete as you might find elsewhere.
The aim here is to highlight some of the better-known examples in order to
provide a bit of context for the examination of the SMB_COM_NEGOTIATE message.

Where relevant, important differences between dialects will be noted. It would
be very difficult, however, to try to document all of the features of each
dialect and all of the changes between them. If you really, really need to know
more (which is likely, if you are working on server code) see the SNIA doc, the
X/Open doc, the expired IETF drafts, and the other old Microsoft documentation
that is still freely available from their FTP server[22].
 

A language is a dialect
with an army and a navy.
-- Uriel Weinreich   

SMB Dialects

Dialect Identifier | Notes

PC NETWORK PROGRAM 1.0
Also known as the Core Protocol. This is the original stuff, as documented in
COREP.TXT. According to ancient lore, this dialect is sometimes also identified
by the string "PCLAN1.0".

MICROSOFT NETWORKS 1.03
This is the Core Plus Protocol. It extends a few Core Protocol SMB commands, and
adds a few new ones.

MICROSOFT NETWORKS 3.0
Known as the Extended 1.0 Protocol or LAN Manager 1.0. This dialect was created
when IBM and Microsoft were working together on OS/2. This particular variant
was designed for DOS clients, which understood a narrower set of error codes
than OS/2.

LANMAN1.0
Identical to the MICROSOFT NETWORKS 3.0 dialect except that it was intended for
use with OS/2 clients, so a larger set of error codes was available. OS/2 and
DOS both expect that the STATUS field will be in the DOS-style ErrorClass /
ErrorCode format. Again, this dialect is also known as LAN Manager 1.0 or as the
Extended 1.0 Protocol.

LM1.2X002
Called the Extended 2.0 Protocol; also known as LAN Manager 2.0. This dialect
represents OS/2 LANMAN version 2.0, and it introduces a few new SMBs. The
identifier for the DOS version of this dialect is "DOS LM1.2X002". As before,
the key difference between the DOS and OS/2 dialects is simply that the OS/2
version provides a larger set of error codes.

LANMAN2.1
Called the LAN Manager 2.1 dialect (no surprise there), this version is
documented in a paper titled Microsoft Networks SMB File Sharing Protocol
Extensions, Document Version 3.4. You can find it by searching the web for a
file named "SMB-LM21.DOC". You will likely need a conversion tool of some sort
in order to read the file, as it is encoded in an outdated form of a proprietary
Microsoft format (it's a word-processing file). The cool thing about the
SMB-LM21.DOC document is that instead of explaining how LANMAN2.1 works it
describes how LANMAN2.1 differs from its predecessor, LANMAN2.0. That's useful
for people who want to know how the protocol has evolved.

Samba
You may see this dialect listed in the protocol negotiation request coming from
a Samba-based client such as smbclient, KDE Konqueror (which uses Samba's
libsmbclient library), or the Linux SMBFS implementation. No one from the Samba
Team seems to remember when, or why, this was added. It doesn't appear to be
used any more (if, indeed, it ever was).

NT LM 0.12
This dialect, sometimes called NT LANMAN, was developed for use with WindowsNT.
All of the Windows9x clients also claim to speak it as do Windows2000 and XP. As
mentioned above, this is currently the most widely supported. It is, quite
possibly, also the sloppiest with all sorts of variations and differing
implementations.

CIFS
Following the release of the IETF CIFS protocol drafts, many people thought that
Microsoft would produce a "CIFS" dialect, and many documents refer to it. No
such beast has actually materialized, however. Maybe that's a good thing.

Section 3.16 of the SNIA CIFS Technical Reference, V1.0 provides a list of
SMB message types categorized by the dialect in which they were introduced.
There is also a slightly more complete list of dialects in section 5.4 of the
SNIA doc.


2.6.2 GREETINGS: THE NEGOTIATE PROTOCOL REQUEST

We have already provided a detailed breakdown of a NEGOTIATE PROTOCOL REQUEST
SMB (back in section 2.4.3), so we don't need to go to the trouble of
fully dissecting it again. The interesting part of the request is the data
section (the parameter section is empty). If we were to write a client that
supported all of the dialects in our chart, the
NEGOTIATE_PROTOCOL_REQUEST.SMB_DATA field would break out something like this:

    SMB_DATA
      {
      ByteCount = 131
      Bytes
        {
        Dialect[0] = "\x02PC NETWORK PROGRAM 1.0"
        Dialect[1] = "\x02MICROSOFT NETWORKS 1.03"
        Dialect[2] = "\x02MICROSOFT NETWORKS 3.0"
        Dialect[3] = "\x02LANMAN1.0"
        Dialect[4] = "\x02LM1.2X002"
        Dialect[5] = "\x02LANMAN2.1"
        Dialect[6] = "\x02Samba"
        Dialect[7] = "\x02NT LM 0.12"
        Dialect[8] = "\x02CIFS"
        }
      }

Each dialect string is preceded by a byte containing the value 0x02. This,
perhaps, was originally intended to make it easier to parse the buffer. In
addition to the 0x02 prefix the dialect strings are nul-terminated, so if you go
to the trouble of counting up the bytes to see if the ByteCount value is correct
in this example don't forget to add 1 to each string length.
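To make the byte counting concrete, here is a minimal sketch that lays the nine dialect strings out in a buffer the way the example shows them. The helper name smb_fill_dialects() and the table smb_dialects[] are made up for this illustration; they are not taken from Listing 2.5.

```c
#include <stddef.h>
#include <string.h>

/* Dialect strings from the example request, in the same order. */
static const char *const smb_dialects[] =
  {
  "PC NETWORK PROGRAM 1.0",
  "MICROSOFT NETWORKS 1.03",
  "MICROSOFT NETWORKS 3.0",
  "LANMAN1.0",
  "LM1.2X002",
  "LANMAN2.1",
  "Samba",
  "NT LM 0.12",
  "CIFS"
  };

#define SMB_DIALECT_COUNT (sizeof smb_dialects / sizeof smb_dialects[0])

/* Hypothetical helper: write the dialect list into buf and return the
 * number of bytes used (the ByteCount), or 0 if buf is too small.
 */
static size_t smb_fill_dialects( unsigned char *buf, size_t bufsize )
  {
  size_t used = 0;
  size_t i;

  for( i = 0; i < SMB_DIALECT_COUNT; i++ )
    {
    size_t len = strlen( smb_dialects[i] ) + 1;   /* string plus nul    */

    if( len + 1 > bufsize - used )                /* plus the 0x02 byte */
      return( 0 );
    buf[used++] = 0x02;                           /* dialect prefix     */
    memcpy( buf + used, smb_dialects[i], len );
    used += len;
    }
  return( used );
  }
```

Called with a buffer of a few hundred bytes, the return value is the number to place in ByteCount. Each entry costs its string length plus two: one byte for the 0x02 prefix and one for the terminating nul.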

Listing 2.5 provides code for creating a NEGOTIATE PROTOCOL REQUEST message. It
also takes care of writing an NBT Session Message header for us--something we
must not forget to do.

[Listing 2.5]


2.6.3 GESUNDHEIT: THE NEGOTIATE PROTOCOL RESPONSE

The NEGOTIATE PROTOCOL RESPONSE SMB is more complex than the request. In
addition to the dialect selection, it also contains a variety of other
parameters that let the client know the capabilities, limitations, and
expectations of the server. Most of these values are stuffed into the
SMB_PARAMETERS block, but there are a few fields defined in the SMB_DATA block
as well.

2.6.3.1 NEGPROT RESPONSE PARAMETERS

The NEGOTIATE_PROTOCOL_RESPONSE.SMB_PARAMETERS.Words block for the NT LM 0.12
dialect is 17 words (34 bytes) in size, and is structured as shown below.
Earlier dialects use a different structure and, of course, the server should
always match the reply to the dialect it selects.

    typedef struct
      {
      uchar WordCount;              /* Always 17 for this struct */
      struct
        {
        ushort DialectIndex;        /* Selected dialect index    */
        uchar  SecurityMode;        /* Server security flags     */
        ushort MaxMpxCount;         /* Maximum Multiplex Count   */
        ushort MaxNumberVCs;        /* Maximum Virtual Circuits  */
        ulong  MaxBufferSize;       /* Maximum SMB message size  */
        ulong  MaxRawSize;          /* Obsolete                  */
        ulong  SessionKey;          /* Unique session ID         */
        ulong  Capabilities;        /* Server capabilities flags */
        ulong  SystemTimeLow;       /* Server time; low bytes    */
        ulong  SystemTimeHigh;      /* Server time; high bytes   */
        short  ServerTimeZone;      /* Minutes from UTC; signed  */
        uchar  EncryptionKeyLength; /* 0 or 8                    */
        } Words;
      } smb_NegProt_Rsp_Params;



She's so dull,
come on rip her to shreds.
-- Rip Her To Shreds
Blondie   

That requires a lot of discussion. Let's tear it up and take a close look at the
tiny pieces.



DialectIndex Things start off fairly simply. The DialectIndex field contains the
index of the dialect string that the server has selected, which will be the
highest-level dialect that the server understands. The dialect strings are
numbered starting with zero, so to choose "NT LM 0.12" from the list in the
example request the server would return 7 in the DialectIndex field.



SecurityMode SecurityMode is a bitfield that provides some information about the
authentication sub-protocol that the server is expecting. Four flag bits are
defined, and are described below. Challenge/Response and Message Authentication
Code (MAC) message signing will be explained later (this is becoming our
mantra), when we cover authentication. It will take a little while to get there,
but keep your eyes open for additional clues along the way.



SecurityMode bits

Bit #  Name / Bitmask / Values
Description

7-4  0xF0

<Reserved> (must be zero)

3  NEGOTIATE_SECURITY_SIGNATURES_REQUIRED
0x08

0: Message signing is optional
1: Message signing is required

If set, this bit indicates that the server is requiring the use of a Message
Authentication Code (MAC) in each packet. If the bit is clear then message
signing is optional.

This bit should be zero if the next bit (mask 0x04) is zero.

2  NEGOTIATE_SECURITY_SIGNATURES_ENABLED
0x04

0: Message signing not supported
1: Server can perform message signing

If set, the server is indicating that it is capable of performing Message
Authentication Code (MAC) message signing.

This bit should be zero if the next bit (mask 0x02) is zero.

1  NEGOTIATE_SECURITY_CHALLENGE_RESPONSE
0x02

0: Plaintext Passwords
1: Challenge/Response

This bit indicates whether or not the server supports Challenge/Response
authentication (which will be covered further on). If the bit is clear, then
plaintext passwords must be used. If set, the server may (optionally) reject
plaintext authentication.

If this bit is clear and the client rejects the use of plaintext, then there is
no way to perform the logon and the client will be unable to connect to the
server.

0  NEGOTIATE_SECURITY_USER_LEVEL
0x01

0: Share Level
1: User Level

Ah! Finally something we've already covered!

This bit indicates whether the server, as a whole, is operating under Share
Level or User Level security. Share and User Level security were explained along
with the TID and UID header fields, back in section 2.5.4.
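The dependencies between the SecurityMode bits can be written down as a small sanity check. This is only a sketch: the masks come from the table above, but the function smb_secmode_is_sane() is a hypothetical name, and a real client may well choose to be more forgiving with servers that break the "should be zero" rules.

```c
#define NEGOTIATE_SECURITY_USER_LEVEL          0x01
#define NEGOTIATE_SECURITY_CHALLENGE_RESPONSE  0x02
#define NEGOTIATE_SECURITY_SIGNATURES_ENABLED  0x04
#define NEGOTIATE_SECURITY_SIGNATURES_REQUIRED 0x08
#define NEGOTIATE_SECURITY_RESERVED            0xF0

/* Hypothetical helper: return 1 if the SecurityMode value follows the
 * rules given in the table, 0 if it does not.
 */
static int smb_secmode_is_sane( unsigned char mode )
  {
  /* Bits 7-4 are reserved and must be zero. */
  if( mode & NEGOTIATE_SECURITY_RESERVED )
    return( 0 );

  /* "Required" should be clear unless "enabled" is set. */
  if( (mode & NEGOTIATE_SECURITY_SIGNATURES_REQUIRED)
   && !(mode & NEGOTIATE_SECURITY_SIGNATURES_ENABLED) )
    return( 0 );

  /* Signing should be clear unless Challenge/Response is set. */
  if( (mode & NEGOTIATE_SECURITY_SIGNATURES_ENABLED)
   && !(mode & NEGOTIATE_SECURITY_CHALLENGE_RESPONSE) )
    return( 0 );

  return( 1 );
  }
```

A value of 0x03, for example, describes a User Level server that supports Challenge/Response without message signing, and passes the check. A value of 0x08 claims that signing is required but not supported, and does not.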



MaxMpxCount Remember the PID and MID fields in the header? They could be used to
multiplex several sessions over a single TCP/IP connection. The thing is, the
server might not be able to handle more than a fixed number of total outstanding
requests.

The MaxMpxCount field lets the server tell the client how many requests, in
total, it can handle concurrently. It is the client's responsibility to ensure
that there are no more than MaxMpxCount outstanding requests in the pipe at any
time. That may mean that client processes will block, waiting for their turn to
send an SMB.



MaxNumberVCs The MaxNumberVCs field specifies the maximum number of Virtual
Circuits (VCs) that the server is able to accommodate. VCs are yet another
mechanism by which multiple SMB sessions could, in theory, be multiplexed over a
single transport-layer session. Note the use of the phrase "in theory". The
dichotomy between theory and practice is a recurring theme in the study of CIFS.



MaxBufferSize MaxBufferSize is the size (in bytes) of the largest message that
the server can receive.

Keep in mind that the transport layer will fragment and defragment packets as
necessary. It is, therefore, possible to send very large SMBs and let the lower
layers worry about ensuring safe, fast, reliable delivery.

How big can an SMB message be?

In the NT LM 0.12 dialect, the MaxBufferSize field is an unsigned longword. As
described much earlier on, however, the Length field in the NBT SESSION MESSAGE
is 17 bits wide and the naked transport header has a 24-bit Length field. So the
session headers place slightly more reasonable limits on the maximum size of a
single SMB message.
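As a rough illustration of those limits, the sketch below clamps the server's MaxBufferSize to the largest value that a Length field of a given width can carry. The names are invented for the example, and whether a client should clamp at all (rather than simply trusting MaxBufferSize) is a design choice, not something the references dictate.

```c
#include <stdint.h>

#define NBT_SESSION_LENGTH_BITS 17   /* NBT SESSION MESSAGE Length field */
#define NAKED_TCP_LENGTH_BITS   24   /* naked transport Length field     */

/* Largest value that fits in a Length field of the given width
 * (bits must be less than 32).
 */
static uint32_t smb_max_length( unsigned int bits )
  {
  return( ((uint32_t)1 << bits) - 1 );
  }

/* Hypothetical helper: the smaller of the server's MaxBufferSize and
 * the transport's own limit.
 */
static uint32_t smb_usable_max( uint32_t max_buffer_size, unsigned int bits )
  {
  uint32_t limit = smb_max_length( bits );

  return( (max_buffer_size < limit) ? max_buffer_size : limit );
  }
```

With 17 bits the ceiling is 131071 bytes, and with 24 bits it is 16777215 bytes. Both are far below what a 32-bit MaxBufferSize could claim.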



MaxRawSize This is the maximum size of a raw data buffer.

The X/Open doc describes the READ RAW and WRITE RAW SMBs, which were introduced
with the Extended 1.0 version of SMB (the MICROSOFT NETWORKS 3.0 and LANMAN1.0
dialects). These were a speed hack. For a large read or write operation, the
first message would be a proper SMB, but subsequent messages would be sent in
"raw" mode, with no SMB or session header. The raw blocks could be as large as
MaxRawSize bytes in length. Once again, the transport layer was expected to take
care of fragmentation/defragmentation and the re-sending of any lost packets.

Raw mode is not used much any more. Among other things, it conflicts with
message signing because the raw messages have no header in which to put the MAC
Signature. Thus, the MaxRawSize field is considered obsolete[23].



SessionKey The SessionKey is supposed to be used to identify the session in
which a VC has been opened. Documentation on the use of this field is very poor,
however, and the commentary in various mailing list archives shows that there is
not much agreement about what to do with it.

In theory, the SessionKey value should be echoed back to the server whenever the
client sends a SESSION SETUP request. Samba's smbclient does this, but some
versions of jCIFS always reply with zero, and they don't seem to have any
trouble with it. In testing, it also appears that Windows2000 servers do not
generate a session key. They send zero in NEGOTIATE PROTOCOL RESPONSE messages.
Hmmm...

It would seem that the use of this field was never clearly defined--anywhere by
anyone--and that most servers really don't care what goes there. It is probably
safest if the client echoes back the value sent by the server.



Capabilities This is a grab-bag bitfield, similar in style to the FLAGS and
FLAGS2 fields in the header except, of course, that it is not included in every
message. The bits of the Capabilities field indicate specific server features of
which the client may choose to take advantage.

We are already building up a backlog of unexplained features. We will also
postpone the discussion of the Capabilities field until we get some of the other
stuff out of the way.



SystemTimeLow and SystemTimeHigh The SystemTime fields are shown as two unsigned
longs in the SNIA doc. We might write it as:

    typedef struct
      {
      ulong timeLow;
      ulong timeHigh;
      } smb_Time;

Keeping byte-order in mind, the completed time value should be read as two
little-endian 32-bit integers. The result, however, should be handled as a
64-bit signed value representing the number of tenths of a microsecond since
January 1, 1601, 00:00:00.0 UTC.

WHAT?!?!

Yes, you read that right folks. The time value is based on that unwieldy little
formula. Read it again five times and see if you don't get a headache. Looks as
though we need to get out the protractor, the astrolabe, and the didgeridoo and
try a little calculating. Let's start with some complex scientific equations:



 * 1 microsecond = 10^-6 seconds
 * 1/10 microsecond = 10^-7 seconds

In other words, the server time is given in units of 10^-7 seconds[24]. Many
CIFS implementations handle these units by converting them into Unix-style
measurements. Unix, of course, bases its time measurements on an equally obscure
date: January 1, 1970, 00:00:00.0 UTC[25]. Converting between the two schemes
requires knowing the difference (in seconds) between the two base times.


 
 
Proof by Faith: I believe
that this has been proven
...somewhere.
-- Jonathan Young, PhD.   



email

--------------------------------------------------------------------------------

From: Andrew Narver In-Reply-To: A message from Mike Allen sent to Microsoft's
CIFS mailing list and the Samba-Technical mailing list.

> (what's the number of seconds between 1601 and 1970 again?)

Between Jan 1, 1601 and Jan 1, 1970, you have 369 complete years, of which 89
are leap years (1700, 1800, and 1900 were not leap years). That gives you a
total of 134774 days or 11644473600 seconds.
 

So, if you want to convert the SystemTime to a Unix time_t value, you need to do
something like this:



> unix_time = (time_t)(((smb_time)/10000000) - 11644473600);

Which gives you the server's system time in seconds since January 1, 1970,
00:00:00.0 UTC.
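Spelled out a little more fully, the conversion might look like the sketch below. It assumes that the two 32-bit halves have already been read from the wire in little-endian order, and that time_t is wide enough to hold the result. The function and macro names are illustrative only.

```c
#include <stdint.h>
#include <time.h>

#define SMB_TIME_UNITS_PER_SEC 10000000LL      /* 10^7 units per second */
#define SMB_EPOCH_DELTA_SECS   11644473600LL   /* 1601 to 1970, seconds */

/* Combine SystemTimeLow and SystemTimeHigh into one 64-bit value. */
static int64_t smb_time_join( uint32_t low, uint32_t high )
  {
  return( (int64_t)(((uint64_t)high << 32) | (uint64_t)low) );
  }

/* Convert a 64-bit SMB SystemTime to seconds since the Unix epoch.
 * The fractional part of the second is discarded.
 */
static time_t smb_time_to_unix( int64_t smb_time )
  {
  return( (time_t)((smb_time / SMB_TIME_UNITS_PER_SEC)
                   - SMB_EPOCH_DELTA_SECS) );
  }
```

As a quick check, the SMB time value 116444736000000000 is exactly 11644473600 seconds past the 1601 base, so it converts to a Unix time of zero.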



ServerTimeZone ServerTimeZone, of course, is the timezone in which the server
believes it resides. It is represented as an offset relative to UTC, in minutes.
Minutes, that is. Multiply by 60 in order to get seconds, or 600,000,000 to get
tenths of a microsecond.

The available documentation (the SNIA doc and the Leach/Naik IETF draft) states
that this field is an unsigned short integer. They're wrong. The field is a
signed value which is subtracted from the SystemTime to give local time.

If, for example, your server is located in the beautiful city of Saint Paul,
Minnesota, it would be in the US Central timezone[26] which is six hours west of
UTC. The value in the ServerTimeZone field would, therefore, be 360 minutes.
(Except, of course, during the summer when Daylight Saving Time is in effect, in
which case it would be 300 minutes.) On the other hand, if your server is in
Moscow in the winter, the ServerTimeZone value would be -180.

The basic rule of thumb:

> LocalTime = SystemTime - ( ServerTimeZone × 600000000 )

...which returns local time in units of 10^-7 seconds, based on January 1601 as
described above.
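Here is the rule of thumb as a small C sketch. The one thing it takes care over is treating ServerTimeZone as a signed 16-bit value and widening it before the multiplication. The name smb_local_time() is made up for the example.

```c
#include <stdint.h>

#define SMB_TIME_UNITS_PER_MINUTE 600000000LL  /* 60 seconds x 10^7 */

/* Hypothetical helper: apply the signed ServerTimeZone offset (in
 * minutes) to a SystemTime value.  The result is still in units of
 * 10^-7 seconds since January 1, 1601.
 */
static int64_t smb_local_time( int64_t system_time, int16_t server_time_zone )
  {
  return( system_time
        - ((int64_t)server_time_zone * SMB_TIME_UNITS_PER_MINUTE) );
  }
```

For the Saint Paul example the offset of 360 moves the clock back six hours. For Moscow the offset of -180 moves it forward three.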

If you found all of that to be complicated, you will be relieved to know that
this is only one of many different time formats used in SMB. Time And Date
Encoding is covered in section 3.7 of the SNIA doc.



EncryptionKeyLength This is the last field in the
NEGOTIATE_PROTOCOL_RESPONSE.SMB_PARAMETERS block. It provides the length, in
bytes, of the Challenge used in Challenge/Response authentication. SMB
Challenges, if present, are always 8 bytes long, so the EncryptionKeyLength will
have a value of either 8 or 0--the latter if Challenge/Response authentication
is not in use.

The name of this field is probably a hold-over from some previous enhancement to
the protocol--still in use for "historical reasons".
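Since the fields are packed on the wire with no alignment, one way to read them is byte by byte, using the offsets implied by the structure above. The following is a sketch under that assumption. The type negprot_rsp_params, the get_u16()/get_u32() helpers, and negprot_unpack() are hypothetical names, not the functions from the book's own listings.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
  {
  uint16_t DialectIndex;
  uint8_t  SecurityMode;
  uint16_t MaxMpxCount;
  uint16_t MaxNumberVCs;
  uint32_t MaxBufferSize;
  uint32_t MaxRawSize;
  uint32_t SessionKey;
  uint32_t Capabilities;
  uint32_t SystemTimeLow;
  uint32_t SystemTimeHigh;
  int16_t  ServerTimeZone;
  uint8_t  EncryptionKeyLength;
  } negprot_rsp_params;

/* Read little-endian values one byte at a time. */
static uint16_t get_u16( const uint8_t *p )
  {
  return( (uint16_t)(p[0] | (p[1] << 8)) );
  }

static uint32_t get_u32( const uint8_t *p )
  {
  return( (uint32_t)p[0]         | ((uint32_t)p[1] << 8)
        | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24) );
  }

/* src points at the WordCount byte.  Returns 0 on success, or -1 if
 * the block is too short or is not the 17-word NT LM 0.12 form.
 */
static int negprot_unpack( const uint8_t *src, size_t len,
                           negprot_rsp_params *out )
  {
  const uint8_t *w;
  uint16_t       tz;

  if( len < 35 || src[0] != 17 )
    return( -1 );
  w = src + 1;                              /* start of Words[]      */

  out->DialectIndex        = get_u16( w );
  out->SecurityMode        = w[2];
  out->MaxMpxCount         = get_u16( w + 3 );
  out->MaxNumberVCs        = get_u16( w + 5 );
  out->MaxBufferSize       = get_u32( w + 7 );
  out->MaxRawSize          = get_u32( w + 11 );
  out->SessionKey          = get_u32( w + 15 );
  out->Capabilities        = get_u32( w + 19 );
  out->SystemTimeLow       = get_u32( w + 23 );
  out->SystemTimeHigh      = get_u32( w + 27 );
  tz                       = get_u16( w + 31 );
  out->ServerTimeZone      = (tz & 0x8000)  /* signed, per the text  */
                           ? (int16_t)((int32_t)tz - 65536)
                           : (int16_t)tz;
  out->EncryptionKeyLength = w[33];
  return( 0 );
  }
```

Note the odd offsets. The single-byte SecurityMode field pushes every later field off its natural alignment, which is one reason a plain C struct cannot simply be overlaid on the buffer.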



The job's not done
until the paperwork's finished.
-- Lavatory Axiom   

Wow... a lot of stuff there. No time to sit and chat about it right now, though.
We still need to finish out the NEGOTIATE_PROTOCOL_RESPONSE.SMB_DATA block.

2.6.3.2 NEGPROT RESPONSE DATA

SMB_DATA, of course, is handed to us as an array of bytes with the length
provided in the ByteCount field. The parsing of those bytes depends upon the
values in the SMB_PARAMETERS block that we just examined. The structure is
completely different depending upon whether Extended Security has been
negotiated.

Here is what it looks like, more or less, in the NT LM 0.12 dialect:

    typedef struct
      {
      ushort ByteCount;          /* Number of bytes to follow. */
      union
        {
        struct
          {
          uchar GUID[16];        /* 16-byte Globally Unique ID */
          uchar SecurityBlob[];  /* Auth-system dependent      */
          } ext_sec;             /* Extended Security          */
        struct
          {
          uchar EncryptionKey[]; /* 0 or 8 bytes long          */
          uchar DomainName[];    /* nul-terminated string      */
          } non_ext_sec;         /* Non-Extended Security      */
        } Bytes;
      } smb_NegProt_Rsp_Data;

The first thing to note is that this SMB_DATA.Bytes block structure is the union
of two smaller structures:



 * ext_sec is used if Extended Security has been negotiated,



 * non_ext_sec is used otherwise.

The second thing to note is that this is pseudo-code, not valid C code. Some of
the array lengths are unspecified because we don't know the byte-length of the
fields ahead of time. In real code, you will probably need to use pointers or
some other mechanism to extract the variable-length data from the buffer.

Okay, let's chop that structure into little bits...



GUID GUID stands for Globally Unique IDentifier. The GUID field is always 16
bytes long.

As of this writing, research by Samba Team members shows that this value is
probably the same as the GUID identifier used by Active Directory to keep track
of servers in the database. Stand-alone servers (which are not listed in any
Active Directory) also generate and use a GUID. Go figure.

Though this field is only present when Extended Security is enabled, it is not,
strictly speaking, a security field. The value is well known and easily forged.
It is not clear (yet) why this field is even sent to the client. In testing, a
Samba server was configured to fill the GUID field with its own 16-byte Server
Service NetBIOS name...and that worked just fine.



SecurityBlob The SecurityBlob is--as the name says--a blob of security
information. In other words, it is a block of data that contains authentication
information particular to the Extended Security mechanism being used. Obviously,
this field will need to be covered in the Authentication section.

The SecurityBlob is variable in length. Fortunately, the GUID field is always 16
bytes, so the length of the SecurityBlob is (ByteCount - 16) bytes.



EncryptionKey This field should be called Challenge because that's what it
actually contains--the Challenge used in Challenge/Response authentication. The
SMB Challenge, if present, is always eight bytes long. If plaintext passwords
are in use then there is no Challenge, the EncryptionKey will be empty, and the
SMB_PARAMETERS.EncryptionKeyLength field will contain 0.



DomainName This field sometimes contains the NetBIOS name of the Workgroup or NT
Domain to which the server belongs. (We have talked a bit, in previous sections,
about Workgroups and NT Domains so the terms should be somewhat familiar.) In
testing, Samba servers always provided a name in the DomainName field. Windows
systems less reliably so. Windows98, for example, would sometimes provide a
value and sometimes not[27].

The SNIA doc calls this field the OEMDomainName and claims that the characters
will be eight-bit values using the OEM character set of the server (that's the
7-bit ASCII character set augmented by an extended DOS code page which defines
characters for the upper 128 octet values). In fact, this field may contain
either a string of 8-bit OEM characters or a Unicode string with 16-bit
characters. The value of SMB_HEADER.FLAGS2.SMB_FLAGS2_UNICODE_STRINGS will let
you know how to read the DomainName field.
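The field descriptions above suggest a pointer-based approach to the SMB_DATA block. The sketch below is one way to do it, not the only way. It assumes the caller already knows whether Extended Security was negotiated, and it leaves DomainName as raw bytes because the encoding depends on the FLAGS2 Unicode bit. All of the names (negprot_rsp_data, negprot_parse_data(), and so on) are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SMB_GUID_LEN 16

typedef struct
  {
  const uint8_t *guid;           /* 16 bytes, or NULL                */
  const uint8_t *blob;           /* SecurityBlob, or NULL            */
  size_t         blob_len;
  const uint8_t *challenge;      /* EncryptionKey, or NULL           */
  size_t         challenge_len;  /* 0 or 8                           */
  const uint8_t *domain;         /* raw DomainName bytes, or NULL    */
  size_t         domain_len;     /* includes any terminator          */
  } negprot_rsp_data;

/* bytes/byte_count describe SMB_DATA.Bytes.  key_len is the
 * EncryptionKeyLength from the parameter block.  Returns 0 on
 * success, -1 if the lengths do not add up.
 */
static int negprot_parse_data( const uint8_t *bytes, size_t byte_count,
                               int ext_sec, uint8_t key_len,
                               negprot_rsp_data *out )
  {
  out->guid      = NULL;  out->blob   = NULL;  out->blob_len      = 0;
  out->challenge = NULL;  out->domain = NULL;  out->challenge_len = 0;
  out->domain_len = 0;

  if( ext_sec )
    {
    if( byte_count < SMB_GUID_LEN )
      return( -1 );
    out->guid     = bytes;
    out->blob     = bytes + SMB_GUID_LEN;
    out->blob_len = byte_count - SMB_GUID_LEN;
    return( 0 );
    }

  if( (key_len != 0 && key_len != 8) || byte_count < key_len )
    return( -1 );
  if( key_len )
    out->challenge = bytes;
  out->challenge_len = key_len;
  out->domain        = bytes + key_len;
  out->domain_len    = byte_count - key_len;
  return( 0 );
  }
```

The pointers refer into the caller's buffer, so nothing is copied and the results are only valid for as long as the received message is kept around.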


2.6.4 ARE WE THERE YET?

Okay, let's be honest... Ripping apart that NEGOTIATE PROTOCOL RESPONSE SMB was
about as exciting as the epic saga of undercooked toast. It doesn't get any
better than that, though, and there's a lot more of it. Implementing SMB is a
game of patience and persistence. It also helps if you get a cheap thrill from
fiddly little details. (Just don't go parsing your packets in public or people
will look at you funny.)

It seems, too, that our overview of the SMB Header and the NEGOTIATE PROTOCOL
exchange has left a bit of a mess on the floor. We have pulled a lot of concepts
off of the shelves and out of the closets, and we will need to do some sorting
and organizing before we can put them back. Let's see what we've got:



 * Opportunistic Locks (OpLocks), which were taking up space in the
   SMB_HEADER.FLAGS field,
 * Virtual Circuits (we found these in the box labeled MaxNumberVCs),
 * The Capabilities bits (and pieces),
 * Distributed File System (DFS), which spilled out when FLAGS2 fell open,
 * Character Encoding--which seems to get into everything, sort of like cat hair
   and dust,
 * Extended vs. DOS Attributes,
 * Long vs. short names, and...
 * Authentication, including plaintext passwords, Challenge/Response, Extended
   Security, and Packet Signing.

The only way to approach all of these topics is one-at-a-time. ...but first,
take another break. Every now and then, it is a good idea to stop and think
about what has been covered so far. This is one of those times. We have finished
tearing apart SMB headers and the body of a NEGOTIATE PROTOCOL message. That
should provide some familiarity with the overall structure of SMBs. Try doing
some packet captures, or skim through the SNIA CIFS Technical Reference. It
should all begin to make a little more sense now than it did when we started.



--------------------------------------------------------------------------------


2.7 SESSION SETUP

Originally, the SESSION SETUP was not required by--or even defined as part
of--the SMB protocol. It was introduced in the LANMAN days in order to handle
User Level authentication and could be skipped if the server was in Share Level
security mode. These days, however, the SESSION SETUP takes care of a lot of
unfinished business, like cleaning up some of the debris left by the NEGOTIATE
PROTOCOL RESPONSE. In the NT LM 0.12 dialect there must be a SESSION SETUP
exchange before a TREE CONNECT may be sent, even if the server is operating in
Share Level security mode.


2.7.1 SESSION SETUP ANDX REQUEST PARAMETERS

The SESSION SETUP SMB is actually a SESSION SETUP ANDX, which simply means that
there's an AndX block in the parameter section. In the NT LM 0.12 dialect, the
Parameter block is formatted as shown below:

    typedef struct
      {
      uchar WordCount;  /* 12 or 13 words */
      struct
        {
        struct
          {
          uchar  Command;
          uchar  Reserved;
          ushort Offset;
          } AndX;
        ushort MaxBufferSize;
        ushort MaxMpxCount;
        ushort VcNumber;
        ulong  SessionKey;
        ushort Lengths[];  /* 1 or 2 elements */
        ulong  Reserved;
        ulong  Capabilities;
        } Words;
      } smb_SessSetupAndX_Req_Params;

When looking at these C-like structures, keep in mind that they are intended as
descriptions rather than specifications. On the wire, the parameters are packed
tightly into the SMB messages, and they are not aligned. Though the structures
show the type and on-the-wire ordering of the fields, the C programming language
does not guarantee that the layout will be retained in memory. That's why our
example code includes all of those functions and macros for packing and
unpacking the packets[28].

Many of the fields in the SESSION_SETUP_ANDX.SMB_PARAMETERS block should be
familiar from the NEGOTIATE PROTOCOL RESPONSE SMB. This time, though, it's the
client's turn to set the limits.



MaxBufferSize MaxBufferSize is the size (in bytes) of the largest message that
the client can receive. It is typically less than or equal to the server's
MaxBufferSize, but it doesn't need to be.



MaxMpxCount This must always be less than or equal to the server-specified
MaxMpxCount. This is the client's way of letting the server know how many
outstanding requests it will allow. The server might use this value to
pre-allocate resources.



VcNumber This field is used to establish a Virtual Circuit (VC) with the server.
Keep reading, we're almost there...



SessionKey Just echo back whatever you got in the NEGOTIATE PROTOCOL RESPONSE.



Lengths For efficiency's sake the structure above provides the Lengths field,
defined as an array of unsigned short integers and described as having one or
two elements. The SNIA doc and other references go to a lot more trouble and
provide two separate and complete versions of the entire SESSION SETUP REQUEST
structure.

Basically, though, if Extended Security has been negotiated then the Lengths
field is a single ushort, known as SecurityBlobLength in the SNIA doc. (We
touched on the concept of security blobs briefly back in section 2.6.3.2.) If
Extended Security is not in use then there will be two ushort fields identified
by the excessively long names:



 * CaseInsensitivePasswordLength and
 * CaseSensitivePasswordLength.

Obviously, all of this stuff falls into the general category of authentication,
and will be covered in more detail when we finally focus on that topic.



Reserved Four bytes of must-be-zero.



Capabilities This field contains the client capabilities flag bits.

You might notice, upon careful examination, that the client does not send back a
MaxRawSize value. That's because it can specify raw read/write sizes in the
SMB_COM_READ_RAW and SMB_COM_WRITE_RAW requests, if it sends them. These SMBs
are considered obsolete, so newer clients really shouldn't be using them.
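Pulling the field descriptions above together, the sketch below packs the parameter block into a byte buffer, writing either one or two Lengths values. It is an illustration only. The names sess_setup_params, put_u16(), put_u32(), and sess_setup_pack() are hypothetical, and the caller is assumed to have decided already whether Extended Security is in use.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
  {
  uint8_t  AndXCommand;
  uint16_t AndXOffset;
  uint16_t MaxBufferSize;
  uint16_t MaxMpxCount;
  uint16_t VcNumber;
  uint32_t SessionKey;
  uint16_t Lengths[2];
  int      LengthCount;   /* 1 = Extended Security, 2 = otherwise */
  uint32_t Capabilities;
  } sess_setup_params;

/* Write little-endian values and return the advanced pointer. */
static uint8_t *put_u16( uint8_t *p, uint16_t v )
  {
  p[0] = (uint8_t)(v & 0xFF);
  p[1] = (uint8_t)(v >> 8);
  return( p + 2 );
  }

static uint8_t *put_u32( uint8_t *p, uint32_t v )
  {
  p = put_u16( p, (uint16_t)(v & 0xFFFF) );
  return( put_u16( p, (uint16_t)(v >> 16) ) );
  }

/* Returns the number of bytes written, including the WordCount byte,
 * or 0 if the input is invalid or dst is too small.
 */
static size_t sess_setup_pack( uint8_t *dst, size_t dstlen,
                               const sess_setup_params *in )
  {
  uint8_t *p = dst;
  int      i;

  if( in->LengthCount != 1 && in->LengthCount != 2 )
    return( 0 );
  if( dstlen < (size_t)(1 + 2 * (11 + in->LengthCount)) )
    return( 0 );

  *p++ = (uint8_t)(11 + in->LengthCount);   /* WordCount: 12 or 13 */
  *p++ = in->AndXCommand;
  *p++ = 0;                                 /* AndX.Reserved       */
  p = put_u16( p, in->AndXOffset );
  p = put_u16( p, in->MaxBufferSize );
  p = put_u16( p, in->MaxMpxCount );
  p = put_u16( p, in->VcNumber );
  p = put_u32( p, in->SessionKey );
  for( i = 0; i < in->LengthCount; i++ )
    p = put_u16( p, in->Lengths[i] );
  p = put_u32( p, 0 );                      /* Reserved            */
  p = put_u32( p, in->Capabilities );
  return( (size_t)(p - dst) );
  }
```

The fixed fields account for 11 words, so the WordCount comes out as 12 with a single SecurityBlobLength and 13 with the two password lengths.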

There are a couple of fields in the SESSION SETUP REQUEST which touch on
esoteric concepts that we have been promising to explain for quite a while
now--specifically virtual circuits and capabilities--so let's get it over
with...

2.7.1.1 VIRTUAL CIRCUITS

It does seem as though there's a good deal of cruft in the SMB protocol. The
SessionKey, for example, appears to be a vestigial organ, the purpose of which
has been mostly forgotten. Originally, such fields may have been intended to
compensate for a limitation in a specific transport or an older implementation,
or to solve some other problem that isn't a problem any more.

Consider virtual circuits...

The LAN Manager documentation available from Microsoft's ftp site provides the
best clues regarding virtual circuits (see SMB-LM1X.PS, for instance). According
to those docs a virtual circuit (VC) represents a single transport layer
connection, and the VcNumber is a tag used to identify a specific transport link
between a specific client/server pair.

That concept probably needs to be considered in context.

The LANMAN dialects were developed in conjunction with OS/2 (an
honest-to-goodness, really-truly, multitasking OS). OS/2 clients pass SMB
traffic through a redirector--just like DOS and Windows--and it seems as though
there was some concern that multiplexing the SMB traffic from several processes
across a single connection might cause a bit of a bottleneck. So, to avoid
congestion, the redirector could create additional connections to facilitate
faster transfers for individual processes[29]. Under this scheme, all of the
transport level connections from a client to a server were considered part of a
single logical "session" (we now, officially, have way too many meanings for
that term). Within that logical session there could, conversely, be multiple
transport level connections--a.k.a. virtual circuits--up to the limit set in the
NEGOTIATE PROTOCOL RESPONSE.

[Figure 2.10]

Figure 2.10 illustrates the point, and here's how it's supposed to work:



 * Logical Session Creation
   * The client makes an initial connection to the SMB server, performs the
     NEGOTIATE PROTOCOL exchange, and establishes the session by sending a
     SESSION SETUP ANDX REQUEST.
   * The VcNumber in the initial SESSION SETUP ANDX REQUEST is zero (0).



 * Additional VC Creation
   * An additional transport level connection is created.
   * The client sends a new SESSION SETUP ANDX REQUEST with a VcNumber greater
     than zero, but less than the MaxNumberVCs sent by the server.
   * The SessionKey field in the SESSION SETUP ANDX REQUEST must match the
     SessionKey returned in the initial NEGOTIATE PROTOCOL RESPONSE. That's how
     the new VC is bound to the existing logical session.

Ah-Hahhh! The mystery of the SessionKey field is finally revealed. Kind of a
let-down, isn't it?

Whenever a new transport-layer connection is created, the client is supposed to
assign a new VC number. Note that the VcNumber on the initial connection is
expected to be zero to indicate that the client is starting from scratch and is
creating a new logical session. If an additional VC is given a VcNumber of zero,
the server may assume that any existing connections with that same client are
now bogus, and shut them down.

Why do such a thing?
 

There is a finite amount of clue
in the Universe...
and the Universe is expanding.
-- Unknown
    (thanks to John Ladwig
    and Marcus Ranum)   

The explanation given in the LANMAN documentation, the Leach/Naik IETF draft,
and the SNIA doc is that clients may crash and reboot without first closing
their connections. The zero VcNumber is the client's signal to the server to
clean up old connections. Reasonable or not, that's the logic behind it.
Unfortunately, it turns out that there are some annoying side-effects that
result from this behavior. It is possible, for example, for one rogue
application to completely disrupt SMB filesharing on a system simply by sending
Session Setup requests with a zero VcNumber. Connecting to a server through a
NAT (Network Address Translation) gateway is also problematic, since the NAT
makes multiple clients appear to be a single client by placing them all behind
the same IP address[30].

The biggest problem with virtual circuits, however, is that they are not really
needed any more (if, in fact, they ever were). As a result, they are handled
inconsistently by various implementations and are not entirely to be trusted. On
the client-side, the best thing to do is to ignore the concept and view each
transport connection as a separate logical session, one VC per session. Oh!
...and contrary to the specs the client should always use a VcNumber of one,
never zero.

On the server side, it is important to keep in mind that the TID, UID, PID, and
MID are all supposed to be relative to the VC. In particular, TID and UID values
negotiated on one VC have no meaning (and no authority) on another VC, even if
both VCs appear to be from the same client. Another important note is that the
server should not disconnect existing VCs upon receipt of a new VC with a zero
VcNumber. As described above, doing so is impractical and may break things. The
server should let the transport layer detect and report session disconnects. At
most, a zero VcNumber might be a good excuse to send a keep-alive packet.

The whole VC thing probably seemed like a good idea at the time.

2.7.1.2 CAPABILITIES BITS

Remember a little while back when we said that there were subtle variations
within SMB dialects? Well, some of them are not all that subtle once you get to
know them. The Capabilities bits formalize several such variations by letting
the client and server negotiate which special features will be supported. The
server sends its Capabilities field in the NEGOTIATE PROTOCOL RESPONSE, and the
client returns its own set of capabilities in the SESSION SETUP ANDX REQUEST.

The table below provides a listing of the capabilities defined for servers. The
client set is smaller.
 


 
 
 
 
 * <---- Tribble
 . <---- Tribble.gz
-- Karen Swanberg   



Server Capabilities

Bit #  Name / Bitmask
Description

31  CAP_EXTENDED_SECURITY
0x80000000

Set to indicate that Extended Security exchanges are supported.

30  CAP_COMPRESSED_DATA
0x40000000

If set, it indicates that the server can compress Data blocks before sending
them[31]. This might be useful to improve throughput of large file transfers
over low-bandwidth links. This capability requires that the CAP_BULK_TRANSFER
capability also be set. Currently, however, there are no known implementations
that support bulk transfer.

29  CAP_BULK_TRANSFER
0x20000000

If set, the server supports the SMB_COM_READ_BULK and SMB_COM_WRITE_BULK SMBs.

There are no known implementations which support CAP_BULK_TRANSFER and/or
CAP_COMPRESSED_DATA. Samba does not even bother to define constants for these
capabilities.

23  CAP_UNIX
0x00800000

Microsoft reserved this bit based on a proposal (by Byron Deadwiler at
Hewlett-Packard) for a small set of Unix extensions. The SNIA doc describes
these extensions in an appendix. Note, however, that the proposal was made and
the appendix written before the extensions were widely implemented. Samba
supports the SMB Unix extensions, but probably not exactly as specified in the
SNIA doc.

15  CAP_LARGE_WRITEX
0x00008000

If set, the server supports a special mode of the SMB_COM_WRITE_ANDX SMB which
allows the client to send more data than would normally fit into the server's
receive buffers, up to a maximum of 64 Kbytes.

14  CAP_LARGE_READX
0x00004000

Similar to the CAP_LARGE_WRITEX, this bit indicates whether the server can
handle SMB_COM_READ_ANDX requests for blocks of data larger than the reported
maximum buffer size. The theoretical maximum is 64 Kbytes, but the client should
never request more data than it can receive.

13  CAP_INFOLEVEL_PASSTHROUGH
0x00002000

Samba calls this the CAP_W2K_SMBS bit. In testing NT4 systems did not set this
bit, but W2K systems did. Basically, it indicates support for some advanced
requests.

12  CAP_DFS
0x00001000

If set, this bit indicates that the server supports Microsoft's Distributed File
System.

9  CAP_NT_FIND
0x00000200

This is a mystery bit. There is very little documentation about it and what does
exist is not particularly helpful. The SNIA doc simply says that this bit is
"Reserved", but the notes regarding the CAP_NT_SMBS bit state that the latter
implies the former. (Counter-examples have been found in some references, but
not on the wire during testing. Your mileage may vary.)

Basically, though, if this bit is set it indicates that the server supports an
extended set of function calls belonging to a class of calls known as
"transactions".

8  CAP_LOCK_AND_READ
0x00000100 If set, the server is reporting that it supports the obsolete
SMB_COM_LOCK_AND_READ SMB.

...but go back and look at the SMB_HEADER.FLAGS bits described earlier. The
lowest order FLAGS bit is SMB_FLAGS_SUPPORT_LOCKREAD, and it is also supposed to
indicate whether or not the server supports SMB_COM_LOCK_AND_READ (as well as
the complimentary SMB_COM_WRITE_AND_UNLOCK). The thing is, traces from WindowsNT
and Windows2000 systems show the CAP_LOCK_AND_READ bit set while the
SMB_FLAGS_SUPPORT_LOCKREAD is clear.

That doesn't make a lot of sense.

Well... it may be that the server is indicating that it supports the
SMB_COM_LOCK_AND_READ SMB but not the SMB_COM_WRITE_AND_UNLOCK SMB, or it may be
that the server may be using the Capabilities field in preference to the FLAGS
field.

Avoid the use of the SMB_COM_LOCK_AND_READ and SMB_COM_WRITE_AND_UNLOCK SMBs and
everything should turn out alright.

7  CAP_LEVEL_II_OPLOCKS
0x00000080 If set, Level II OpLocks are supported in addition to Exclusive and
Batch OpLocks. 6  CAP_STATUS32
0x00000040 If set, indicates that the server supports the 32-bit NT_STATUS error
codes. 5  CAP_RPC_REMOTE_APIS
0x00000020 If set, this bit indicates that the server permits remote management
via Remote Procedure Call (RPC) requests. RPC is way beyond the scope of this
book. 4  CAP_NT_SMBS
0x00000010 If set, this bit indicates that the server supports some advanced
SMBs that were designed for use with WindowsNT and above. These are,
essentially, an extension to the NT LM 0.12 dialect.

According to the SNIA doc, the CAP_NT_SMBS implies CAP_NT_FIND.

3  CAP_LARGE_FILES
0x00000008 If set, this bit indicates that the server can handle 64-bit file
sizes. With 32-bit file sizes, files are limited to 4GB in size. 2  CAP_UNICODE
0x00000004 Set to indicate that the server supports Unicode. 1  CAP_MPX_MODE
0x00000002 If set, the server supports the (obsolete) SMB_COM_READ_MPX and
SMB_COM_WRITE_MPX SMBs. 0  CAP_RAW_MODE
0x00000001 If set, the server supports the (obsolete) SMB_COM_READ_RAW and
SMB_COM_WRITE_RAW SMBs.

On the server side, the implementor's rule of thumb regarding capabilities is to
start by supporting as few as possible and add new ones one at a time. Each bit
is a cornucopia--or Pandora's box--of new features and requirements, and most
represent a very large development effort. As usual, if there is documentation
it is generally either scarce or encumbered.

Things are not quite so bad if you are implementing a client, though the client
also has a list of capabilities that it can declare. The client list is as
follows:



Client Capabilities

Bit #  Name  Bitmask
    Description

31  CAP_EXTENDED_SECURITY  0x80000000
    Set to indicate that Extended Security exchanges are supported.

    The SNIA doc and the older IETF Draft do not list this as a capability set
    by the client. On the wire, however, it is clearly used as such by Windows,
    Samba, and by Steve French's CIFS VFS for Linux. If the server indicates
    Extended Security support in its Capabilities field, then the client may
    set this bit to indicate that it also supports Extended Security.

9  CAP_NT_FIND  0x00000200
    If set, it indicates that the client is capable of utilizing the
    CAP_NT_FIND capability of the server.

7  CAP_LEVEL_II_OPLOCKS  0x00000080
    If set, this bit indicates that the client understands Level II OpLocks.

6  CAP_STATUS32  0x00000040
    Indicates that the client understands 32-bit NT_STATUS error codes.

4  CAP_NT_SMBS  0x00000010
    Likewise, I'm sure.

    As with the CAP_NT_FIND bit, the client will set this to let the server
    know that it, too, understands the extended set of SMBs and function calls
    that are available if the server has set the CAP_NT_SMBS bit.

3  CAP_LARGE_FILES  0x00000008
    The client sets this to let the server know that it can handle 64-bit file
    sizes and offsets.

2  CAP_UNICODE  0x00000004
    Set to indicate that the client understands Unicode.

The client should not set any bits that were not also set by the server. That
is, the Capabilities bits sent to the server should be the intersection (bitwise
AND) of the client's actual capabilities and the set sent by the server.
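The intersection rule is easy to express in C. The sketch below uses the bit names and values from the two tables above. The function name smb_client_caps and the CLIENT_CAP_MASK macro are made up for illustration; they are not taken from any particular implementation.

```c
#include <stdint.h>

/* Capability bits, as listed in the Server and Client Capabilities tables. */
#define CAP_RAW_MODE           0x00000001u
#define CAP_MPX_MODE           0x00000002u
#define CAP_UNICODE            0x00000004u
#define CAP_LARGE_FILES        0x00000008u
#define CAP_NT_SMBS            0x00000010u
#define CAP_RPC_REMOTE_APIS    0x00000020u
#define CAP_STATUS32           0x00000040u
#define CAP_LEVEL_II_OPLOCKS   0x00000080u
#define CAP_LOCK_AND_READ      0x00000100u
#define CAP_NT_FIND            0x00000200u
#define CAP_DFS                0x00001000u
#define CAP_EXTENDED_SECURITY  0x80000000u

/* Hypothetical helper mask: the bits that appear in the client table. */
#define CLIENT_CAP_MASK (CAP_EXTENDED_SECURITY | CAP_NT_FIND |      \
                         CAP_LEVEL_II_OPLOCKS | CAP_STATUS32 |      \
                         CAP_NT_SMBS | CAP_LARGE_FILES | CAP_UNICODE)

/*
 * Return the Capabilities value a client should send: the bitwise AND of
 * what the client really supports and what the server announced, limited
 * to the bits that a client is allowed to declare.
 */
uint32_t smb_client_caps(uint32_t client_supported, uint32_t server_caps)
{
    return client_supported & server_caps & CLIENT_CAP_MASK;
}
```

A client that supports Unicode, NT_STATUS codes, and large files, talking to a server that announced only CAP_UNICODE, CAP_NT_SMBS, and CAP_RAW_MODE, would end up sending just CAP_UNICODE.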

The Capabilities bits are like the razor-sharp barbs on a government fence.
Attempting to hurdle any one of them can shred your implementation. Consider
adding Unicode support to a system that doesn't already have it. Ooof! That's
going to be a lot of work[32].

Some Capabilities bits indicate support for sets of function calls that can be
made via SMB. These function calls, which are sometimes referred to as
"sub-protocols", fall into two separate (but similar) categories:



 * Remote Administration Protocol (RAP)
 * Remote Procedure Call (RPC)

Of the two, the RAP sub-protocol is older and (relatively speaking) simpler.
Depending upon the SMB dialect, server support for some RAP calls is assumed
rather than negotiated. Fortunately, much of RAP is documented...if you know
where to look[33].

Microsoft's RPC system--known as MS-RPC--is newer, and has a lot in common with
the better-known DCE/RPC system. MS-RPC over SMB allows the client to make calls
to certain Windows DLL library functions on the server side which, in turn,
allows the client to do all sorts of interesting things. Of course, if you are
building a server and you want to support the MS-RPC calls you have to implement
all of the required functions in addition to SMB itself. Unfortunately, much of
MS-RPC is undocumented[34].

The MS-RPC function call APIs are defined using a language called Microsoft
Interface Definition Language (MIDL). There is a fair amount of information
about MIDL available on the web and some of the function interface definitions
have been published. CIFS implementors have repeatedly asked Microsoft for open
access to all of the CIFS-relevant MIDL source files. Unencumbered access to the
MIDL source would go a long way towards opening up the CIFS protocol suite.
Since MIDL provides only the interface specifications and not the function
internals, Microsoft could release them without exposing their proprietary DLL
source code.

Both the RAP and MS-RPC sub-protocols provide access to a large set of features,
and both are too big to be covered in detail here. Complete documentation of all
of the nooks and crannies of CIFS would probably require a set of books large
enough to cause an encyclopedia to cringe in awe, so it would seem that our
attempt to clean up the mess we made with the NEGOTIATE PROTOCOL exchange has
instead created an even bigger mess and left some permanent stains on the
carpet. Ah, well. Such is the nature of CIFS.


2.7.2 SESSION SETUP ANDX REQUEST DATA

The dissection of the SMB_PARAMETERS portion of the SESSION SETUP ANDX REQUEST
cleared up a few issues and exposed a few others. Now we get to look at the
SMB_DATA block and see what further mysteries may lie uncovered.
 

...just another piece of
useless information.
-- Rain on the Hills,
Judie Tzuke   

Fortunately, the Data block is much less daunting. It contains a few fields used
for authentication and the rest is just useful bits of information about the
client's operating environment. The structure looks like this:

    typedef struct
      {
      ushort ByteCount;
      struct
        {
        union
          {
          uchar SecurityBlob[];
          struct
            {
            uchar CaseInsensitivePassword[];
            uchar CaseSensitivePassword[];
            uchar Pad[];
            uchar AccountName[];
            uchar PrimaryDomain[];
            } non_ext_sec;
          } auth_stuff;
        uchar NativeOS[];
        uchar NativeLanMan[];
        uchar Pad2[];
        } Bytes;
      } smb_SessSetupAndx_Req_Data;



auth_stuff: As you may by now have come to expect, the structure of the
auth_stuff field depends upon whether or not Extended Security has been
negotiated. We have shown it as a union type just to emphasize the point. Under
Extended Security, the blob will contain a structure specific to the type of
Extended Security being used. The SecurityBlobLength value in the Parameter
block indicates the size (in bytes) of the SecurityBlob.

If Extended Security has not been negotiated, the structure will contain the
following fields:



CaseInsensitivePassword and CaseSensitivePassword: If these names seem familiar
it's because the associated length fields were in the Parameter block, described
above. These fields are, of course, used in authentication. Section 2.8 covers
authentication in detail.



Pad: If Unicode is in use, then the Pad field will contain a single nul byte
(0x00) to force two-byte alignment of the following fields (which are Unicode
strings).

As you know, the Parameter block is made up of a single byte followed by an
array of zero or more words. It starts on a word boundary, but the WordCount
byte knocks it off balance, so it never ends on a word boundary. That means that
the Data block always starts misaligned[35]. Typically, that's not considered a
problem for data in SMB messages. It is not clear why, but it seems that when
Unicode support was added to SMB it was decided that Unicode strings should be
word-aligned within the SMB message (even though they are likely to be copied
out of the message before they're fiddled). That's why the Pad byte is there.
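Here is a minimal sketch of the arithmetic. It assumes a 32-byte SMB header and assumes that alignment is measured from the start of that header. The names SMB_HEADER_SIZE, smb_bytes_offset, and smb_string_pad are invented for this example.

```c
#include <stddef.h>

#define SMB_HEADER_SIZE 32   /* assumed size of the fixed SMB header */

/*
 * Offset, counted from the start of the SMB header, of the first byte
 * that follows ByteCount.  The Parameter block is one WordCount byte
 * plus word_count two-byte words; ByteCount itself is two bytes.
 * The result is always odd, which is the misalignment described above.
 */
size_t smb_bytes_offset(unsigned word_count)
{
    return SMB_HEADER_SIZE + 1 + 2 * (size_t)word_count + 2;
}

/*
 * Number of pad bytes needed in front of a string that would otherwise
 * start at 'offset'.  Only Unicode strings are word-aligned.
 */
size_t smb_string_pad(size_t offset, int unicode)
{
    return (unicode && (offset & 1)) ? 1 : 0;
}
```

For example, with a WordCount of 13 and two 24-byte password fields (numbers picked for illustration), the field after the passwords would land at offset 61 + 48 = 109. That is odd, so one Pad byte is needed when Unicode is in use and none otherwise.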

Note that if Unicode support is enabled the password fields will always contain
an even number of bytes. Strange but true. Here's why:



 * On Windows server systems, plaintext passwords and Unicode are mutually
   exclusive. The password hashes used for authentication are always an even
   number of bytes.
 * Unlike Windows, Samba can be configured to use plaintext passwords and
   Unicode. In that configuration, the CaseInsensitivePassword field will be
   empty and the CaseSensitivePassword field will contain the password in
   Unicode format--two bytes per character.

Note the subtle glitch here. If Samba is configured to send Unicode plaintext
passwords, the CaseSensitivePassword field will not be word-aligned because the
Pad byte comes afterward. It seems that the designers of the NT LM 0.12 dialect
did not consider the possibility of plaintext Unicode passwords.



Who is your user?
-- Tron

AccountName: This is the username field. If Unicode has been
negotiated, then the username is presented in Unicode. Otherwise, the string is
converted to upper-case and sent using the 8-bit OEM character set.



PrimaryDomain: As with the AccountName, this value is converted to upper-case
unless it is being sent in Unicode format.

Whenever possible, this field should contain the NetBIOS name of the NT Domain
to which the user belongs. Basically, it allows the client to specify the NT
Domain in which the username and password are valid--the Authentication Domain.
A correct value is not always needed, however. If the server is not a member of
an NT Domain, then it will have its own authentication database, and no Domain
Controller need be consulted.

Some testing was done with WindowsNT4 and Windows2000 systems that were not
members of an NT Domain. As clients, these systems sent their own NetBIOS
machine name in the PrimaryDomain field. The smbclient utility sent the
workgroup name, as specified in the smb.conf file. jCIFS just sent a question
mark. All of these variations seem to work, as long as the server maintains its
own authentication database. The PrimaryDomain field is really only useful when
authenticating against a Domain Controller.

...and that's the end of the auth_stuff block. On to the rest of it.



NativeOS: This string identifies the host operating system. Windows systems, of
course, will fill this field with their OS name and some revision information.
This field will be expressed in Unicode if that format has been negotiated.



NativeLanMan: Similar to the NativeOS field, this one contains a short
description of the client SMB software. Smbclient fills this field with the name
"Samba". jCIFS used to just say "foo" here, but starting with release
0.7.0beta10 it says "jCIFS". The successful use of "foo" demonstrates, however,
that the field is not used for anything critical on the server side. Just error
reporting, most likely.



email

--------------------------------------------------------------------------------

From: Gerald (Jerry) Carter To: Chris Hertel Subject: NativeLanMan

Note that NT4 misaligns the NativeLanMan string by one byte (see Ethereal for
details). Also note that Samba uses this string to distinguish between
W2K/XP/2K3 for the %a smb.conf variable. So it is used by the server in some
cases.
 





Pad2: Some systems add one or two extra nul bytes at the end of the SESSION
SETUP. Not all clients do this; it appears to be more common if Unicode has been
negotiated. The extra bytes pad the end of the SESSION SETUP to the next word
boundary. If these bytes are present, they are generally included in the total
count given in the ByteCount field.

We have done a lot of work ripping apart packet structures and studying the
internal organs. Don't worry, that's the last of it. You should be familiar
enough with this stuff by now, so from here on out we will rely on the SNIA doc
and packet traces to provide the gory details.



Don't Know When to Quit Alert:


--------------------------------------------------------------------------------

Some of the Windows systems that were tested did not place the correct number of
nul bytes at the ends of some Unicode strings. Consider, for example, this
snippet from an Ethereal capture:



0000029F                           57 00 69 00 6e 00 64 00          W.i.n.d.
000002AF  6f 00 77 00 73 00 20 00  4e 00 54 00 20 00 31 00 o.w.s. . N.T. .1.
000002BF  33 00 38 00 31 00 00 00  00 00 57 00 69 00 6e 00 3.8.1... ..W.i.n.
000002CF  64 00 6f 00 77 00 73 00  20 00 4e 00 54 00 20 00 d.o.w.s.  .N.T. .
000002DF  34 00 2e 00 30 00 00 00  00 00                   4...0... ..

Look closely, and you will see that there are two extra nul bytes following each
of the two Unicode strings in the hex dump. Under UCS-2LE encoding, the nul
string terminator would be encoded as two nul bytes (00 00). In the sample
above, however, there are four nul bytes (00 00 00 00) following the last
Unicode character of each string.

In this next excerpt, taken from a SESSION SETUP ANDX RESPONSE SMB, it appears
as though one of the terminating nul bytes at the end of the PrimaryDomain field
has been lost:



0000008F                                             57 00                W.
0000009F  69 00 6e 00 64 00 6f 00  77 00 73 00 20 00 35 00 i.n.d.o. w.s. .5.
000000AF  2e 00 30 00 00 00 57 00  69 00 6e 00 64 00 6f 00 ..0...W. i.n.d.o.
000000BF  77 00 73 00 20 00 32 00  30 00 30 00 30 00 20 00 w.s. .2. 0.0.0. .
000000CF  4c 00 41 00 4e 00 20 00  4d 00 61 00 6e 00 61 00 L.A.N. . M.a.n.a.
000000DF  67 00 65 00 72 00 00 00  55 00 42 00 49 00 51 00 g.e.r... U.B.I.Q.
000000EF  58 00 00                                         X..

The first two bytes of the last line (58 00) are the letter 'X' in UCS-2LE
encoding. They should be followed by two nul bytes...but there's only one.
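Given quirks like these, a parser should never trust the terminator to be there. The following is a sketch of one defensive way to measure a UCS-2LE string; the function name ucs2le_len is hypothetical, and the approach (stop at the first 00 00 pair or when the buffer runs out) is just one reasonable choice.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the 16-bit code units in the UCS-2LE string that starts at buf.
 * Scanning stops at the first 00 00 pair, or when fewer than two bytes
 * remain in the buffer, so a missing or truncated terminator cannot
 * cause a read past the end of the message.
 */
size_t ucs2le_len(const uint8_t *buf, size_t avail)
{
    size_t i = 0;

    while (i + 1 < avail && (buf[i] != 0 || buf[i + 1] != 0))
        i += 2;
    return i / 2;
}
```

Fed the last eleven bytes of the second capture (the "UBIQX" string with its lone trailing nul), this returns 5 rather than running off the end of the packet. Extra nul bytes after a terminator, as in the first capture, still have to be skipped by the caller before reading the next string.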
 




2.7.3 THE SESSION SETUP ANDX RESPONSE SMB

The SESSION SETUP ANDX RESPONSE SMB structure is described in section 4.1.2 of
the SNIA doc.

In the NT LM 0.12 dialect, there are two versions of the SESSION SETUP ANDX
RESPONSE message. They differ, of course, based on whether or not Extended
Security is in use. In the Extended Security version the Parameter block has a
SecurityBlobLength field, and there is an associated SecurityBlob within the
Data block. These two fields are missing from the non-Extended Security version.
Other than that, the two are the same.

The SESSION SETUP ANDX RESPONSE message also has an interesting little bitfield
called SMB_PARAMETERS.Action. Only the low-order bit (bit 0) of this field is
defined. If set, it indicates that the username was not recognized by the server
(that is, authentication failed--no such user) but the logon is being allowed to
succeed anyway.

That's rather odd, eh?

What it means is this: If the username (in the AccountName field) is not
recognized, the server may choose to grant anonymous or guest authorization
instead. Anonymous access typically provides only very limited access to the
server. For example, it may allow the use of a limited set of RAP function calls
such as those used for querying the Browse Service.

So, the Action bit is used to indicate that the logon attempt failed, but
anonymous access was granted instead. No error code will be returned in this
case, so the Action bit is the only indication to the client that the rules have
changed. Server-side support for this behavior is optional.
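On the client side, the check is a one-liner. This sketch assumes that Action is a 16-bit field, as laid out in the SNIA doc; the macro and function names are invented here.

```c
#include <stdint.h>

/* Hypothetical name for bit 0 of SMB_PARAMETERS.Action. */
#define SMB_ACTION_GUEST 0x0001u

/*
 * Return 1 if the server let the logon succeed with anonymous or guest
 * rights instead of the requested account, 0 otherwise.  No error code
 * accompanies this case, so the client has to test the bit itself.
 */
int smb_logged_in_as_guest(uint16_t action)
{
    return (action & SMB_ACTION_GUEST) != 0;
}
```

A client that cares about the difference should test this after every successful SESSION SETUP ANDX and warn the user, or give up, when the bit is set.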



--------------------------------------------------------------------------------


2.8 AUTHENTICATION

Now for the big one...

If you are familiar with authentication schemes then this section should be
comfortable for you. If not, then perhaps it's time for a fresh pot of tea. Some
people find their first experience with the innards of password security to be a
bit intimidating, possibly because the encryption formulae are sometimes made to
look a lot like mathematics. Authentication itself isn't really that complex,
though. The basic idea is that the would-be user needs to prove that they are
who they say they are in order to get what they want. The proof is usually in
the form of something private or secret--something that only the user has or
knows.
 

Car locks are there
to keep the honest people honest.
-- Something my brother Robert
once told me. (He sells cars.)

  

Consider, for example, the key to an automobile (something you have). With the
key in hand, you are able to unlock the door, turn the ignition switch, and
start the engine. As far as the car is concerned, you have proven that you have
the right to drive. Likewise with the password you use to access your computer
(something you know). If you enter a valid username/password pair at the login
prompt, then you can access the system. Unfortunately passwords, like keys, can
be stolen or forged or copied. Just as locks can be picked, so passwords can be
cracked[36].

In the early days of SMB, when the LANs were small and sheltered, there was very
little concern for the safety of the password itself. It was sent in plaintext
(un-encrypted) over the wire from the client to the server. Eventually, though,
corporate networks got bigger, modems were installed to provide access from home
and on the road, the "disgruntled employee" boogeyman learned how to use a
keyboard, and everything got connected to the Internet. These were hard times
for plaintext passwords, so a series of schemes was developed to keep the
passwords safe--each more complex than its predecessor.

For SMB, the initial attempt was called LAN Manager Challenge/Response
authentication, often simply abbreviated "LM". The LM scheme turned out to be
too simple and too easy to crack, and was replaced with something stronger
called WindowsNT Challenge/Response (known as "NTLM"). NTLM was superseded by
NTLMv2 which has, in turn, been replaced with a modified version of MIT's
Kerberos system.

Got that?

We'll go through them all in various degrees of detail. The LM algorithm is
fairly simple, so we can provide a thorough description. At the other extreme,
Kerberos is an entire system unto itself and anything more than an overview
would be overkill.


2.8.1 ANONYMOUS AND GUEST LOGIN

Gather and study piles of SMB packet captures and you will notice that some
SESSION SETUP requests contain no username and password at all. These are
anonymous logins, and they are used to access special-purpose SMB shares such as
the hidden "IPC$" share (the Inter-Process Communications share). You can learn
more about IPC$ in the Browsing section. Put simply, though, this share allows
one system to query another using RAP function calls.

Anonymous login may be a design artifact; something created in the days of Share
Level security when it seemed safe to leave a share unprotected, and still with
us today because it cannot easily be removed. Maybe not. One guess is as good as
another.

"GUEST" account logons are also often sent sans password. The guest login is
sometimes used in the same way as the anonymous login, but there are additional
permissions which a guest account may have. Guest accounts are maintained like
other "normal" accounts, so they can be a security problem and are commonly
disabled. When SMB is doing its housekeeping, the anonymous login is generally
preferred over the guest login.


2.8.2 PLAINTEXT PASSWORDS

This is the easiest SMB authentication mechanism to implement--and the least
secure. It's roughly equivalent to leaving your keys in the door lock after
you've parked the car. Sure, the car is locked, but...

Plaintext passwords may still be sufficient for use in small, isolated networks,
such as home networks or small office environments (assuming no disgruntled
employees and a well-configured firewall on the uplink--or no Internet
connection at all). Plaintext passwords also provide us with a nice opportunity
to get our feet wet in the mired pool of authentication. We can look at the
packets and clearly see what is happening on the wire. Note, however, that many
newer clients are configured to prevent the use of plaintext. Windows clients
have registry entries that must be twiddled in order to permit plaintext
passwords, and jCIFS did not support them at all until version 0.7.

In order to set up a workable test environment you will need a server that does
not expect encrypted passwords, and a client that doesn't mind sending the
passwords in the clear. That is not an easy combination to come by. Most
contemporary SMB clients and servers disable plaintext by default. It is easy,
however, to configure Samba so that it requests unencrypted passwords. Just
change the encrypt passwords parameter to no in the smb.conf file, like so:

    ; Disable encrypted passwords.
    encrypt passwords = no

Don't forget to signal smbd to reload the configuration file after making this
change.

On the client side we will, once again, use the jCIFS Exists utility in our
examples. If you would rather use a Windows client for your own tests, you can
find a collection of helpful registry settings in the docs/Registry/
subdirectory of the Samba distribution. You will probably need to change the
registry settings to permit the Windows client to send plaintext passwords.
Another option as a testing tool is Samba's smbclient utility, which does not
seem to argue if the server tells it not to encrypt the passwords.

This is what our updated Exists test looks like:



 shell


  $ java -DdisablePlainTextPasswords=false Exists \
  > smb://pat:p%40ssw0rd@smedley/home
  smb://pat:p%40ssw0rd@smedley/home exists
  $
  

A few things to note:



 * The -DdisablePlainTextPasswords=false command-line option tells jCIFS that it
   should permit the use of plaintext.



 * The username and password are passed to jCIFS via the SMB URL. The syntax is
   fairly common for URLs[37]. Basically, it looks like this:

       smb://[[user[:password]@]host[:port]]
   
   The username in our example is pat.



 * The password in our example is p@ssw0rd, but the '@' in the password
   conflicts with the '@' used to separate the userinfo field from the hostport
   field[38]. To resolve the conflict we encode the '@' in p@ssw0rd using the URL
   escape sequence "%40", which gives us p%40ssw0rd.



 * If at all possible, applications should be written to request the password in
   a more secure fashion, and to hide it once it has been given. The [:password]
   syntax is not part of the general URL syntax definition, and its use is
   highly discouraged. Having the password display on the screen is as naughty
   as sending it across the wire in plaintext.
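The escaping described in the notes above can be sketched in C. This is an illustration only: the function name url_escape is made up, and the choice to leave just letters, digits, and "-._~" unescaped is a conservative assumption rather than anything jCIFS is known to do.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Percent-encode src into dst.  ASCII letters, digits, and "-._~" are
 * copied as-is; every other byte becomes %XX.  Returns the length of
 * the result (not counting the nul), or -1 if dst is too small.
 */
int url_escape(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            if (n + 1 >= dst_size)      /* keep room for the nul */
                return -1;
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= dst_size)      /* three bytes plus the nul */
                return -1;
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 0x0F];
        }
    }
    if (n >= dst_size)
        return -1;
    dst[n] = '\0';
    return (int)n;
}
```

Passing "p@ssw0rd" through this function yields "p%40ssw0rd", which is the form used in the Exists example.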

2.8.2.1 USER LEVEL SECURITY WITH PLAINTEXT PASSWORDS

User and Share Level security were described back in section 2.5.4, along with
the TID and [V]UID header fields. The SecurityMode field of the NEGOTIATE
PROTOCOL RESPONSE SMB will indicate the authentication expectations of the
server. For User Level plaintext passwords, the value of the SecurityMode field
will be 0x01.
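A client can test for this case with a couple of masks. The sketch assumes the usual meaning of the two low-order SecurityMode bits (bit 0 selects User Level, bit 1 asks for challenge/response); the macro and function names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical names for the two low-order SecurityMode bits. */
#define SECMODE_USER_LEVEL 0x01u   /* clear means Share Level        */
#define SECMODE_ENCRYPT    0x02u   /* set means challenge/response   */

/*
 * Return 1 if the server expects User Level security with plaintext
 * passwords, that is, bit 0 set and bit 1 clear.
 */
int smb_wants_user_plaintext(uint8_t security_mode)
{
    return (security_mode & (SECMODE_USER_LEVEL | SECMODE_ENCRYPT))
           == SECMODE_USER_LEVEL;
}
```

Masking before comparing means that any higher-order bits the server might set do not affect the answer.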

Below is an example SESSION_SETUP_ANDX.SMB_DATA block such as would be generated
by the jCIFS Exists tool. Note, once again, that the discussion is focused on
the NT LM 0.12 dialect.

     SMB_DATA
      {
      ByteCount = 27
      Bytes
        {
        CaseInsensitivePassword = "p@ssw0rd"
        CaseSensitivePassword   = <NULL>
        Pad                     = <NULL>
        AccountName             = "PAT"
        PrimaryDomain           = "?"
        NativeOS                = "Linux"
        NativeLanMan            = "jCIFS"
        }
      }
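The ByteCount of 27 works out if each of the five strings is sent with its terminating nul: 9 + 4 + 2 + 6 + 6. The sketch below packs the fields that way. It covers only the non-Unicode plaintext case, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append s, including its terminating nul, to buf at *pos. */
static int put_str(uint8_t *buf, size_t size, size_t *pos, const char *s)
{
    size_t len = strlen(s) + 1;

    if (len > size - *pos)
        return -1;
    memcpy(buf + *pos, s, len);
    *pos += len;
    return 0;
}

/*
 * Pack the Bytes array for a non-Unicode, plaintext SESSION SETUP ANDX.
 * CaseSensitivePassword and Pad are empty in this case, so they add
 * nothing.  Returns the ByteCount value, or -1 if buf is too small.
 */
int pack_plaintext_bytes(uint8_t *buf, size_t size,
                         const char *password, const char *account,
                         const char *domain, const char *native_os,
                         const char *native_lanman)
{
    size_t pos = 0;

    if (put_str(buf, size, &pos, password) != 0 ||
        put_str(buf, size, &pos, account) != 0 ||
        put_str(buf, size, &pos, domain) != 0 ||
        put_str(buf, size, &pos, native_os) != 0 ||
        put_str(buf, size, &pos, native_lanman) != 0)
        return -1;
    return (int)pos;
}
```

Called with "p@ssw0rd", "PAT", "?", "Linux", and "jCIFS", the function returns 27, matching the example above.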

There are always fiddly little details to consider when working with SMB. In
this case, we need to talk about upper- and lower-case. (bLeCH.) The example
above shows that the AccountName field has been converted to upper-case. This is
common practice, but it is not really necessary and some implementations don't
bother. It is a holdover from the early days of SMB when lots of things
(filenames, passwords, share names, NetBIOS names, bagels, and pop singers) were
converted to upper-case as a matter of course. Some older servers
(pre-NT LM 0.12) may require upper-case usernames, but newer servers shouldn't
care. Converting to upper-case is probably the safest option, just in case...

Although the AccountName in the example is upper-case, the
CaseInsensitivePassword is not. Hmmm... Odd, eh? The situation here is that some
server operating systems (e.g., most Unixy OSes) use case-sensitive password
verification algorithms. If the password is sent all upper-case it probably
won't match what the OS expects, resulting in a login failure even though the
user entered the correct password. The field may be labeled case-insensitive
(and that really is what it is intended to be) but some server OSes prefer to
have the original password, case preserved, just as the user entered it.
 

A tradition is a mistake
that you have made
more than once.
-- Stephanie Cohen   

This is a sticky problem, though, because some clients insist on converting
passwords to upper-case before sending them to the server. Windows95 and '98 may
do this, for example. As you might have come to expect by now, the reason for
this odd behavior is backward compatibility. There are older (pre-NT LM 0.12)
servers still running that will reject passwords that are not all upper-case.
Windows9x systems solve the problem by forcing all passwords to upper-case even
when the NT LM 0.12 dialect has been selected. Samba's smbd server, which
generally runs on case-sensitive platforms, must go through a variety of
contortions to get upper-case plaintext passwords to be accepted[39].

Another annoyance is that Windows98 will pad the plaintext password string to 24
bytes, filling the empty space with semi-random garbage. This behavior was noted
in testing, but there wasn't time to investigate the problem in-depth so it may
or may not be widespread. Still, it's the odd case that will break things.
Server implementors should be careful to both check the field length and look
for the first terminating nul byte when reading the plaintext password.
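That advice translates into very little code. In this sketch the function name plaintext_pw_len is hypothetical; the only library call is the standard memchr().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the usable length of a plaintext password field.  field_len is
 * the length given in the Parameter block.  The password ends at the
 * first nul byte if there is one, and at field_len otherwise, so any
 * garbage padding after the nul is ignored and nothing beyond the field
 * is ever read.
 */
size_t plaintext_pw_len(const uint8_t *field, size_t field_len)
{
    const uint8_t *nul = memchr(field, 0, field_len);

    return nul ? (size_t)(nul - field) : field_len;
}
```

With a 24-byte field that holds "secret", a nul, and 17 bytes of junk, the result is 6.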

In short, client-side handling of the plaintext CaseInsensitivePassword is
inconsistent and problematic--and the server has to compensate. That's why you
need piles of SMB packet captures and lots of different clients to test against
when writing a server implementation. It can be done, but it takes a bit of
perseverance. When writing a new client, ensure that the client sends the
password as the user intended. If that fails, and the dialect is pre-NT LM 0.12,
then convert to upper-case and try again. Believe it or not, the use of
challenge/response authentication bypasses much of this trouble.

...but that's only half the story. In addition to the CaseInsensitivePassword
field there is also a CaseSensitivePassword field in the data block, and we
haven't even touched on that yet. This latter field is only used if Unicode has
been negotiated, and it is rare that both Unicode and plaintext will be used
simultaneously. It can happen, though. As mentioned earlier, Samba can be easily
configured to provide support for Unicode plaintext passwords[40]. In theory, this
should be a simple switch from ASCII to Unicode. In practice, no client really
supports it yet--and weird things have been seen on the wire. For example:



 * Clients disagree on the length of the Unicode password string in
   CaseSensitivePassword. Some count the pair of nul bytes that terminate the
   string, others do not. (For comparison, the length of the ASCII
   CaseInsensitivePassword string does include the terminating nul, so it seems
   there is precedent.)



 * In testing, more than one client stored the length of the Unicode password in
   the CaseInsensitivePasswordLength field... but that's where the ASCII
   password length is supposed to go. The Unicode password length should be in
   the CaseSensitivePasswordLength field. How should the server interpret the
   password in this situation--as ASCII or Unicode?



 * One client added a nul byte at the beginning of the Unicode password string,
   probably intended as a padding byte to force word alignment. The extra nul
   byte was being read as the first byte of the CaseSensitivePassword, thus
   misaligning the Unicode string. Another client went further and counted the
   extra byte in the total length of the Unicode password string. As a result,
   the password length was given as an odd number of bytes (which should never
   happen).

Empirically, it would seem that Unicode plaintext passwords were never meant to
be.

An interesting fact-ette that can be gleaned from this discussion is that there
is a linkage between the password fields and the negotiation of Unicode. Simply
put:



    ASCII (OEM character set)  <==>  CaseInsensitivePassword
    Unicode                    <==>  CaseSensitivePassword

That is, ASCII plaintext passwords are stored in the CaseInsensitivePassword
field, and Unicode plaintext passwords should be placed into the
CaseSensitivePassword field. Indeed, Ethereal names these two fields,
respectively, as "ANSI Password" and "Unicode Password" instead of using the
longer names shown above. This relationship carries over to the
challenge/response passwords as well, as we shall soon see.
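For a client, the mapping boils down to choosing which length field to fill in. The sketch below assumes one 16-bit code unit per password character. Because clients disagree about counting the Unicode terminator, that choice is left as a parameter. The type and function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t CaseInsensitivePasswordLength;  /* ASCII (OEM) password */
    uint16_t CaseSensitivePasswordLength;    /* Unicode password     */
} pw_lengths;

/*
 * Work out the two length fields for a plaintext password of nchars
 * characters.  The ASCII length includes the terminating nul.  For
 * Unicode, count_unicode_nul says whether to count the two-byte
 * terminator, since behavior seen on the wire varies.
 */
pw_lengths plaintext_pw_lengths(size_t nchars, int unicode,
                                int count_unicode_nul)
{
    pw_lengths l = {0, 0};

    if (unicode)
        l.CaseSensitivePasswordLength =
            (uint16_t)(2 * (nchars + (count_unicode_nul ? 1 : 0)));
    else
        l.CaseInsensitivePasswordLength = (uint16_t)(nchars + 1);
    return l;
}
```

For the eight-character example password this gives an ASCII length of 9, or a Unicode length of 16 or 18 depending on whether the terminator is counted. Either way the Unicode length is even.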

2.8.2.2 SHARE LEVEL SECURITY WITH PLAINTEXT PASSWORDS

We won't spend too much time on this. It is easy to see by looking at packet
captures. Basically, in Share Level security mode the plaintext password is
passed to the server in the TREE CONNECT ANDX request instead of the SESSION
SETUP ANDX. In the NT LM 0.12 dialect, however, a valid username should also be
placed into the SESSION SETUP AccountName field if at all possible. Doing so
allows the server to map Share Level security to its own user-based
authentication system.



Interesting Implementation Alert:


--------------------------------------------------------------------------------

Samba does not completely implement Share Level security. Though all of the
required SMBs are supported, Samba does not provide any way to assign a password
to a share.

Many SMB clients will provide a username (if one is available) in the SESSION
SETUP ANDX SMB even though it is not (technically) required at Share Level. If
there is no username available, however, Samba will attempt (through various
methods--some of which might be considered kludgey) to guess an appropriate
username for the connection. Read through the smb.conf(5) manual page if you are
interested in the details.
 




2.8.3 LM CHALLENGE/RESPONSE

In plaintext mode, the client proves that it knows the password by sending the
password itself to the server. In challenge/response mode, the goal is to prove
that the password is known without risking any exposure. It's a bit of a magic
trick. Here's how it's done:



 1. The server generates a random string of bytes--random enough that it is not
    likely to come up again in a very, very long time (thus preventing replay
    attacks). This string is called the challenge.



 2. The challenge is sent to the client.

 3. Both the client and server encrypt the challenge using a key that is derived
    from the user's password. The client sends its result back to the server.
    This is the response.

 4. If the client's response matches the server's result, then the server knows
    (beyond a reasonable doubt) that the client knows the correct key. If they
    don't match, authentication fails.

[Figure 2.11]

That's a rough, general overview of challenge/response. The details of its use
in LAN Manager authentication are a bit more involved, but are fairly easy to
explain. As we dig deeper, keep in mind that the goal is to protect the password
while still allowing authentication to occur. Also remember that LM
challenge/response was the first attempt to add encrypted password support to
SMB.

2.8.3.1 DES

The formula used to generate the LM response makes use of the U.S. Department of
Commerce Data Encryption Standard (DES) function, in block-mode. DES has been
around a long time. There are a lot of references which describe it and a good
number of implementations available, so we will not spend a whole lot of time
studying DES itself[41]. For our purposes, the important thing to know is that the
DES function--as used with SMB--takes two input parameters and returns a result,
like so:



> result = DES( key, source );

The source and result are both eight-byte blocks of data, the result being the
DES encryption of the source. In the SNIA doc, as in the Leach/Naik draft, the
key is described as being seven bytes (56 bits) long. Documentation on DES
itself gives the length of the key as eight bytes (64 bits), but each byte
contains a parity bit so there really are only 56 bits worth of "key" in the
64-bit key. As shown in figure 2.12, there is a simple formula for mapping 56
bits into the required 64-bit format. The seven byte string is simply copied,
seven bits at a time, into an eight byte array. A parity bit (odd parity) is
inserted following each set of seven bits (but some existing DES implementations
use zero and ignore the parity bit).

[Figure 2.12]

The key is used by the DES algorithm to encrypt the source. Given the same key
and source, DES will always return the same result.
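The seven-to-eight byte mapping is easier to see in code than in prose. What follows is a minimal sketch of the idea, not code lifted from any particular DES library. The function names des_key_7to8() and set_odd_parity() are made up for this illustration.

```c
/* Force the low-order bit of b so that the byte as a whole contains an odd
 * number of 1 bits.  The seven key bits occupy the high-order positions.
 */
static unsigned char set_odd_parity( unsigned char b )
  {
  unsigned char t = (unsigned char)(b & 0xFE);
  int ones = 0;

  while( t )
    {
    ones += t & 1;
    t >>= 1;
    }
  return( (unsigned char)((b & 0xFE) | ((ones % 2) ? 0x00 : 0x01)) );
  }

/* Copy a 7-byte (56-bit) key into 8 bytes, seven bits at a time, then
 * fill in the parity bit that follows each group of seven.
 */
void des_key_7to8( const unsigned char k7[7], unsigned char k8[8] )
  {
  int i;

  k8[0] = (unsigned char)(  k7[0]                        & 0xFE);
  k8[1] = (unsigned char)(((k7[0] << 7) | (k7[1] >> 1)) & 0xFE);
  k8[2] = (unsigned char)(((k7[1] << 6) | (k7[2] >> 2)) & 0xFE);
  k8[3] = (unsigned char)(((k7[2] << 5) | (k7[3] >> 3)) & 0xFE);
  k8[4] = (unsigned char)(((k7[3] << 4) | (k7[4] >> 4)) & 0xFE);
  k8[5] = (unsigned char)(((k7[4] << 3) | (k7[5] >> 5)) & 0xFE);
  k8[6] = (unsigned char)(((k7[5] << 2) | (k7[6] >> 6)) & 0xFE);
  k8[7] = (unsigned char)( (k7[6] << 1)                 & 0xFE);

  for( i = 0; i < 8; i++ )
    k8[i] = set_odd_parity( k8[i] );
  }
```

Given seven nul bytes as input, every output byte comes back as 0x01: seven zero key bits plus a parity bit set to make the count of ones odd. A DES implementation that ignores parity would be just as happy with 0x00 in each position.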

2.8.3.2 CREATING THE CHALLENGE

The challenge needs to be very random, otherwise the logon process could be made
vulnerable to "replay" attacks.

A replay attack is fairly straight-forward. The attacker captures the exchange
between the server and the client and keeps track of the challenge, the
response, and the username. The attacker then tries to log on, hoping that the
challenge will be repeated (this step is easier if the challenge is at all
predictable). If the server sends a challenge that is in the stored list, the
attacker can use the recorded username and response to fake a logon. No password
cracking required.

Given that the challenge is eight bytes (64 bits) long, and that random number
generators are pretty good these days, it is probably best to create the
challenge using a random number function. The better the random number
generator, the lower the likelihood (approaching 1 in 2^64) that a particular
challenge will be repeated.

The X/Open doc (which was written a long time ago) briefly describes a different
approach to creating the challenge. According to that document, a seven-byte
pseudo-random number is generated using an internal counter and the system time.
That value is then used as the key in a call to DES(), like so:

    Ckey = fn( time( NULL ), counter++ );
    challenge = DES( Ckey, "????????" );

(...the source string is honest-to-goodnessly given as eight question marks.)

That formula actually makes a bit of sense, though it's probably overkill. The
pseudo-random Ckey is non-repeating (because it's based on the time), so the
resulting challenge is likely to be non-repeating as well. Also note that the
pseudo-random value is passed as the key, not the source, in the call to DES().
That makes it much more difficult to reverse and, since it changes all the time,
reversing it is probably not useful anyway.
 

Anybody remotely interesting
is mad, in some way or another.
--Doctor Who   

As Andrew Bartlett[42] points out, however, the time and counter inputs are easily
guessed so the challenge is predictable, which is a potential weakness. Adding a
byte or two of truly random "salt" to the Ckey in the recipe above would prevent
such predictability.



email

--------------------------------------------------------------------------------

From: Andrew Bartlett To: Chris Hertel Subject: SMB Challenge...

Actually, given comments I've read on some SMB cracking sites, it would not
surprise me if MS still does (or at least did) use exactly this for the
challenges.

I still think you should address the X/Open function not as 'overkill' but as
'flawed'.
 



Using a plain random number generator is probably faster, easier, and safer.
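As a rough illustration of the "plain random number generator" approach, here is a minimal sketch that pulls the eight challenge bytes from the operating system. It assumes a Unix-like system that provides /dev/urandom; the function name make_challenge() is invented for this example, and other platforms would need a different random source.

```c
#include <stdio.h>

/* Fill challenge[] with eight bytes from the system's random source.
 * Returns 0 on success, -1 on failure.
 * Assumption: /dev/urandom exists and is readable.
 */
int make_challenge( unsigned char challenge[8] )
  {
  FILE  *f = fopen( "/dev/urandom", "rb" );
  size_t got;

  if( NULL == f )
    return( -1 );
  got = fread( challenge, 1, 8, f );
  fclose( f );
  return( (8 == got) ? 0 : -1 );
  }
```

A server would call this once per NEGOTIATE PROTOCOL exchange and must refuse to continue if it returns -1, since a predictable fallback value would reopen the door to replay attacks.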

2.8.3.3 CREATING THE LM HASH

LM challenge/response authentication prevents password theft by ensuring that
the plaintext password is never transmitted across a network or stored on disk.
Instead, a separate value known as the "LM Hash" is generated. It is the LM Hash
that is stored on the server side for use in authentication, and used on the
client side to create the response from the challenge.

The LM Hash is a sixteen byte string, created as follows:



 1. The password, as entered by the user, is either padded with nuls (0x00) or
    trimmed to fourteen (14) bytes[43].
    * Note that the 14-byte password string is not handled as a nul-terminated
      string. If the user enters 14 or more bytes, the last byte in the modified
      string will not be nul.
    * Also note that the password is given in the 8-bit OEM character set
      (extended ASCII), not Unicode.

 2. The 14-byte password string is converted to all upper-case.

 3. The upper-case 14-byte password string is chopped into two 7-byte keys.

 4. The seven-byte keys are each used to DES-encrypt the string constant
    "KGS!@#$%", which is known as the "magic" string[44].

 5. The two 8-byte results are then concatenated together to form the 16-byte LM
    Hash.

That outline would make a lot more sense as code, wouldn't it? Well, you're in
luck. Listing 2.6 shows how the steps given above might be implemented.

[Listing 2.6]
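The first three steps need no DES at all, so they can be sketched on their own. The fragment below is only an illustration of the padding, upper-casing, and chopping. The name lm_password_keys() is hypothetical, and toupper() in the default C locale only handles 7-bit ASCII, so a real implementation would need proper upper-casing for the OEM character set.

```c
#include <ctype.h>
#include <string.h>

/* Steps 1 through 3: pad or trim the password to 14 bytes, upper-case it,
 * and split it into two 7-byte DES keys.
 * Assumption: the password is plain 7-bit ASCII.
 */
void lm_password_keys( const char   *password,
                       unsigned char key0[7],
                       unsigned char key1[7] )
  {
  unsigned char buf[14];
  size_t        len = strlen( password );
  size_t        i;

  if( len > 14 )
    len = 14;                           /* trim; no nul terminator is kept */
  memset( buf, 0, sizeof( buf ) );      /* pad short passwords with nuls   */
  for( i = 0; i < len; i++ )
    buf[i] = (unsigned char)toupper( (unsigned char)password[i] );

  memcpy( key0, buf,     7 );
  memcpy( key1, buf + 7, 7 );
  }
```

Each 7-byte key would then be used to DES-encrypt the magic string (steps 4 and 5). Note what happens with a password such as "secret": key0 is "SECRET" plus one nul, and key1 is seven nuls.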

2.8.3.4 CREATING THE LM RESPONSE

Now we get to the actual logon. When a NEGOTIATE PROTOCOL REQUEST arrives from
the client, the server generates a new challenge on the fly and hands it back in
the NEGOTIATE PROTOCOL RESPONSE.

On the client side, the user is prompted for the password. The client generates
the LM Hash from the password, and then uses the hash to DES-encrypt the
challenge. Of course, it's not a straight-forward DES operation. As you may have
noticed, the LM Hash is 16 bytes but the DES() function requires 7-byte keys.
Ah, well... Looks as though there's a bit more padding and chopping to do.



 1. The password entered by the user is converted to a 16-byte LM Hash as
    described above.



 2. The LM Hash is padded with five nul bytes, resulting in a string that is 21
    bytes long.

 3. The 21 byte string is split into three 7-byte keys.

 4. The challenge is encrypted three times, once with each of the three keys
    derived from the LM Hash.

 5. The results are concatenated together, forming a 24-byte string which is
    returned to the server. This, of course, is the response.

Once again, we provide demonstrative code. Listing 2.7 shows how the LM Response
would be generated.

[Listing 2.7]
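The padding and splitting (steps 2 and 3) can be shown without any cryptography. This is a sketch only; lm_response_keys() is a name made up for the example, and the DES calls that would consume the three keys are left out.

```c
#include <string.h>

/* Pad a 16-byte hash with five nul bytes and split the resulting
 * 21 bytes into three 7-byte DES keys.
 */
void lm_response_keys( const unsigned char hash16[16],
                       unsigned char       keys[3][7] )
  {
  unsigned char p21[21];
  int           i;

  memset( p21, 0, sizeof( p21 ) );      /* bytes 16..20 stay nul */
  memcpy( p21, hash16, 16 );

  for( i = 0; i < 3; i++ )
    memcpy( keys[i], p21 + (7 * i), 7 );
  }
```

The third key is worth a second look: only its first two bytes come from the hash. The other five are always zero.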

The server, which has the username and associated LM Hash tucked away safely in
its authentication database, also generates the 24-byte response string. When
the client's response arrives, the server compares its own value against the
client's. If they match, then the client has authenticated.

Under User Level security, the client sends its LM Response in the
SESSION_SETUP_ANDX.CaseInsensitivePassword field of the SESSION SETUP request
(yes, the LM response is in the SESSION SETUP REQUEST). With Share Level
security, the LM Response is placed in the TREE_CONNECT_ANDX.Password field.

2.8.3.5 LM CHALLENGE/RESPONSE: ONCE MORE WITH FEELING

The details sometimes obfuscate the concepts, and vice versa. We have presented
a general overview of the challenge/response mechanism, as well as the
particular formulae of the LAN Manager scheme. Let's go through it once again,
quickly, just to put the pieces together and cover anything that we may have
missed.



The LM Hash:  The LM Hash is derived from the password. It is used instead of the
password so that the latter won't be exposed. A copy of the LM Hash is stored on
the server (or Domain Controller) in the authentication database.

On the down side, the LM Hash is password equivalent. Because of the design in
the LM challenge/response mechanism, a cracker[45] can use the LM Hash to break
into a system. The password itself is not, in fact, needed. Thus, the LM Hash
must be protected as if it were the password.



The Challenge:  If challenge/response is required by the server, the SecurityMode
field of the NEGOTIATE PROTOCOL RESPONSE will have bit 0x02 set, and the
challenge will be found in the EncryptionKey field. Challenge/response may be
used with either User Level or Share Level security.



The Logon:  On the client side, the user will--at some point--be prompted for a
password. The password is converted into the LM Hash. Meanwhile, the server (or
NT Domain Controller) has its own copy of the LM Hash, stored in the
authentication database. Both systems use the LM Hash to generate the LM
Response from the challenge.



The LM Response:  The client sends the LM Response to the server in either the
SESSION_SETUP_ANDX.CaseInsensitivePassword field or the
TREE_CONNECT_ANDX.Password field, depending upon the security level of the
server. The server compares the client's response against its own to see if they
match.



The SESSION SETUP ANDX RESPONSE:  To finish up, the server will send back a
SESSION SETUP ANDX RESPONSE. The STATUS field will indicate whether the logon
was successful or not.
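The decision about where the LM Response travels can be boiled down to a couple of bit tests on SecurityMode. The sketch below is illustrative only. The 0x02 bit is the one named in the text above; the 0x01 "user level" bit and both macro names are assumptions of this example and should be checked against the description of the NEGOTIATE PROTOCOL RESPONSE.

```c
#include <stddef.h>

#define SECMODE_USER_LEVEL          0x01  /* assumed: User Level security   */
#define SECMODE_CHALLENGE_RESPONSE  0x02  /* challenge/response is required */

/* Return the name of the field that should carry the LM Response, or
 * NULL if the server did not ask for challenge/response at all.
 */
const char *lm_response_field( unsigned char SecurityMode )
  {
  if( !(SecurityMode & SECMODE_CHALLENGE_RESPONSE) )
    return( NULL );
  if( SecurityMode & SECMODE_USER_LEVEL )
    return( "SESSION_SETUP_ANDX.CaseInsensitivePassword" );
  return( "TREE_CONNECT_ANDX.Password" );
  }
```

A SecurityMode of 0x03 would therefore send the response in the SESSION SETUP, while 0x02 (Share Level with challenge/response) would send it in the TREE CONNECT.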

Well, that's a lot of work and it certainly goes a long way towards looking
complicated. Unfortunately, looking complicated isn't enough to truly protect a
password. LM challenge/response is an improvement over plaintext, but there are
some problems with the formula and it turns out that it is not, in fact, a very
big improvement.
 

...and we'll have fun, fun, fun
'till somebody takes the
keyboard away.
-- Not quite as The
Beach Boys intended.   

Let's consider what an attacker might do to try and break into a system. We've
already explained the replay attack. Other common garden varieties include the
"dictionary" and the "brute force" attack, both of which simply try pushing
possible passwords through the algorithm until one of them returns the same
response seen on the wire. The dictionary attack is typically faster because it
uses a database of likely passwords, so tools tend to try this first. The brute
force method tries all (remaining) possible combinations of bytes, which is
usually a longer process. Unfortunately, all of the upper-casing, nul-padding,
chopping, and concatenating used in the LM algorithm makes LM challenge/response
very susceptible to these attacks. Here's why:

The LM Hash formula pads the original password with nul bytes. If the password
is short enough (seven or fewer characters) then, when the 14-byte padded
password is split into two seven-byte DES keys, the second key will always be a
string of seven nuls. Given the same input, DES produces the same output...



> 0xAAD3B435B51404EE = DES( "\0\0\0\0\0\0\0", "KGS!@#$%" )

...which results in an LM Hash in which the second set of eight bytes are known:



    offset:  0  1  2  3  4  5  6  7    8  9 10 11 12 13 14 15
            |------ result 0 -----|   |------ result 1 -----|
    value:  ?? ?? ?? ?? ?? ?? ?? ??   AA D3 B4 35 B5 14 04 EE

To create the LM Response, the LM Hash is padded with nuls to 21 bytes, and then
split again into three DES keys:



    offset:  0  1  2  3  4  5  6    7  8  9 10 11 12 13   14 15 16 17 18 19 20
            |----- key 0 ------|   |----- key 1 ------|   |----- key 2 ------|
    value:  ?? ?? ?? ?? ?? ?? ??   ?? AA D3 B4 35 B5 14   04 EE 00 00 00 00 00

...and now the problem is obvious. If the original password was seven bytes or
less, then almost two-thirds of the encryption key used to generate the LM
Response will be a known, constant value. The password cracking tools leverage
this information to reduce the size of the keyspace (the set of possible
passwords) that needs to be tested to find the password. Less obvious, but clear
enough if you study the LM Response algorithm closely, is that short passwords
are only part of the problem. Because the hash is created in pieces, it is
possible to attack the password in 7-byte chunks even if it is longer than 7
bytes.
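To see how cheaply this can be exploited, consider someone who has obtained an LM Hash (from an authentication database, say). A single memcmp() tells them whether the password is seven characters or fewer. This is a sketch built on the constant shown above; the function name is made up.

```c
#include <string.h>

/* DES( seven nul bytes, "KGS!@#$%" ), as given in the text. */
static const unsigned char LM_NULL_HALF[8] =
  { 0xAA, 0xD3, 0xB4, 0x35, 0xB5, 0x14, 0x04, 0xEE };

/* Returns 1 if the second half of the LM Hash was generated from seven
 * nul bytes, meaning the password was no more than seven bytes long.
 */
int lm_hash_second_half_is_null( const unsigned char hash16[16] )
  {
  return( 0 == memcmp( hash16 + 8, LM_NULL_HALF, 8 ) );
  }
```

The same idea carries over to a captured LM Response, though there the attacker has to run DES once with the candidate third key (04 EE 00 00 00 00 00) against the challenge and compare the result with the last eight bytes of the response.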

Converting to upper-case also diminishes the keyspace, because lower-case
characters do not need to be tested at all. The smaller the keyspace, the faster
a dictionary or brute-force attack can run through the possible options and
discover the original password[46].
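A little arithmetic shows the scale of the damage. The sketch below counts candidate strings under one simplifying assumption that is ours, not the protocol's: passwords are drawn from the 95 printable ASCII characters, which leaves 69 distinct characters once the 26 lower-case letters are folded to upper-case. The function name keyspace() is invented for the example.

```c
/* Number of distinct strings of exactly 'length' characters drawn from an
 * alphabet of 'symbols' characters.  The result overflows an unsigned
 * long long for large inputs (95 symbols at length 14, for example).
 */
unsigned long long keyspace( unsigned int symbols, unsigned int length )
  {
  unsigned long long total = 1;

  while( length-- > 0 )
    total *= symbols;
  return( total );
  }
```

Under that assumption, keyspace( 95, 7 ) is 69,833,729,609,375 while keyspace( 69, 7 ) is 7,446,353,252,589, so upper-casing alone shrinks each 7-byte chunk by a factor of roughly nine. Because the two chunks can be attacked separately, a 14-character password costs about twice the work of one chunk rather than the square of it.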


2.8.4 NTLM CHALLENGE/RESPONSE

At some point in the evolution of Windows NT a new, improved challenge/response
formula was introduced. It was similar to the LAN Manager version, with the
following changes:



 1. Instead of using the upper-case ASCII (OEM character set) password, NTLM
    challenge/response generates the hash from the mixed-case Unicode (UCS-2LE)
    representation of the password. This change alone makes the password much
    more difficult to crack.

 2. Instead of the DES() function, NTLM uses the MD4() message digest function
    described in RFC 1320. This function produces a 16-byte hash (the NTLM
    Hash[47]) but requires no padding or trimming of the input (though the
    resulting 16-byte NTLM Hash is still padded with nuls to 21 bytes for use in
    generating the NTLM Response).

 3. The NTLM Response is sent to the server in the
    SESSION_SETUP_ANDX.CaseSensitivePassword field.

...and that's basically it. The rest of the formula is the same.

So what does it buy us?

The first advantage of NTLM is that the passwords are more complex. They're
mixed case and in Unicode, which means that the keyspace is much larger. The
second advantage over LM is that the MD4() function doesn't require fixed length
input. That means no padding bytes and no chopping to over-simplify the keys.
The NTLM Hash itself is more robust than the LM Hash, so the NTLM Response is
much more difficult to reverse.

Unfortunately, the NTLM Response is still created using the same algorithm as is
used with LM, which provides only 56-bit encryption. Worse, clients often
include both the NTLM Response and the LM Response (derived from the weaker LM
Hash) in the SESSION SETUP ANDX REQUEST. They do this to maintain
backward-compatibility with older servers. Even if the server refuses to accept
the LM Response, the client has sent it. Ouch.



Brain Overflow Alert:


--------------------------------------------------------------------------------

The next section describes the NTLMv2 algorithm. It's not really that difficult,
but it can get tedious--especially if your head is still swimming from the LM
and NTLM algorithms. Jerry Carter of the Samba Team warns that your brain may
explode if you try to understand it all the first time through. (Most veteran
CIFS engineers have had this happen at least twice.)

You may want to skim through section 2.8.5 and possibly section 2.8.9, which
describes Message Authentication Codes (MACs). You can always come back and read
them again after you've iced your cranium.
 






2.8.5 NTLM VERSION 2



There is a theory which states
that if ever anyone discovers
exactly what the Universe is for
and why it is here, it will
instantly disappear
and be replaced by something
even more bizarre
and inexplicable.
There is another which states
that this has already happened.
-- Douglas Adams
The Restaurant at the End of the Universe   

NTLMv2, as it's called, has some additional safeguards thrown into the recipe
that make it more complex--and hopefully more secure--than its predecessors.
There are, however, two small problems with NTLMv2:



 * Good documentation on the inner workings of NTLMv2 is rare.



 * Although it is widely available, NTLMv2 does not seem to be widely used.

Regarding the first point, Appendix B of Luke K. C. Leighton's book DCE/RPC over
SMB: Samba and Windows NT Domain Internals provides a recipe for NTLMv2
authentication. We'll do our best to expand on Luke's description. The other
option, of course, is to look at available Open Source code.

The second point is really conjecture, based in part on the fact that it took a
very long time to get NTLMv2 implemented in Samba and few seemed to care.
Indeed, NTLMv2 support had already been added to Samba-TNG by Luke and crew, and
needed only to be copied over. It seems that the delay in adding it to Samba was
not a question of know-how, but of priorities.

Another factor is that NTLMv2 is not required by default on most Windows
systems. When challenge/response is negotiated, even newer Windows versions will
default to using the LM/NTLM combination unless they are specifically configured
not to.

2.8.5.1 THE NTLMV2 TOOLBOX

We have already fussed with the DES algorithm and toyed with the MD4 algorithm.
Now we get to use the HMAC-MD5 Message Authentication Code hash. This one's a
power tool with razor-sharp keys and swivel-action hashing. The kind of thing
your Dad would never let you play with when you were a kid. Like all good tools,
though, it's neither complex nor dangerous once you learn how it works.

[Figure 2.13]

HMAC-MD5 is actually a combination of two different algorithms: HMAC and MD5.
HMAC is a Message Authentication Code (MAC) algorithm that takes a hashing
function (such as MD5) and adds a secret key to the works so that the resulting
hash can be used to verify the authenticity of the data. The MD5 algorithm is
basically an industrial-strength version of MD4. Put them together and you get
HMAC-MD5.

HMAC-MD5 is quite well documented[48], and there are a lot of implementations
available. It's also much less complicated than it appears in figure 2.13, so we
won't need to go into any of the details. For our purposes, what you need to
know is that the HMAC_MD5() function takes a key and some source data as inputs,
and returns a 16-byte (128-bit) output.

Hmmm... Well, it's not actually quite that simple. See, MD4, MD5, and HMAC-MD5
all work with variable-length input, so they also need to know how big their
input parameters are. The function call winds up looking something like this:



> hash16 = HMAC_MD5( Key, KeySize, Data, DataSize );

There is, as it turns out, more than one way to skin an HMAC-MD5. Some
implementations use a whole set of functions to compute the result:



 * the first function accepts the key and creates an initial context,
 * the second function may be called repeatedly, each time passing the context
   and the next block of data,
 * and the final function is used to close the context and return the resulting
   hash.

Conceptually, though, the multi-function approach is the same as the simpler
example shown above. That is: Key and Data in, 16-byte hash out.



Not Quite Entirely Unlike Standard Alert:


--------------------------------------------------------------------------------

The HMAC-MD5 function can handle very large Key inputs. Internally, though,
there is a maximum keysize of 64 bytes. If the key is too long, the function
uses the MD5 hash of the key instead. In other words, inside the HMAC_MD5()
function there is some code that does this:

      if( KeySize > 64 )
        {
        Key = MD5( Key, KeySize );
        KeySize = 16;
        }

In his book, Luke explains that the function used by Windows systems is actually
a variation on HMAC-MD5 known as HMACT64, which can be quickly defined as
follows:

      #define HMACT64( K, Ks, D, Ds ) \
              HMAC_MD5( (K), (((Ks) > 64) ? 64 : (Ks)), (D), (Ds) )

In other words, the HMACT64() function is the same as HMAC_MD5() except that it
truncates the input Key to 64 bytes rather than hashing it down to 16 bytes
using the MD5() function as prescribed in the specification.

As you read on, you will probably notice that the keys used by the NTLMv2
challenge/response algorithm are never more than 16 bytes, so the distinction is
moot for our purposes. We bother to explain it only because HMACT64() may be
used elsewhere in CIFS (in some dark corner that we have not visited) and it
might be a useful tidbit of information for you to have.
 



Another important tool is the older NTLM hash algorithm. It was described
earlier but it is simple enough that we can present it again, this time in
pseudo-code:

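    uchar *NTLMhash( uchar *password )
      {
      /* UCS2LE() and MD4() stand for helpers that are not shown here. */
      UniPasswd = UCS2LE( password );       /* two bytes per character   */
      KeySize   = 2 * strlen( password );   /* terminator is not counted */
      return( MD4( UniPasswd, KeySize ) );  /* the 16-byte NTLM Hash     */
      }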

The ASCII password is converted to Unicode UCS-2LE format, which requires two
bytes per character. The KeySize is simply the length of that (Unicode) password
string, which we calculate here by doubling the ASCII string length (which is
probably cheating). Finally, we generate the MD4 hash (that's MD4, not MD5) of
the password, and that's all there is to it.

Note that the string terminator is not counted in the KeySize. That is common
behavior for NTLM and NTLMv2 challenge/response when working with Unicode
strings.
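For the simple case of a 7-bit ASCII password, the UCS2LE() step amounts to interleaving zero bytes. Here is a minimal sketch under that assumption; the name ascii_to_ucs2le() is hypothetical, and anything outside 7-bit ASCII would need a real character set conversion.

```c
#include <stddef.h>
#include <string.h>

/* Convert a 7-bit ASCII string to UCS-2LE: each character becomes its own
 * byte value followed by a zero byte.  No terminator is written or counted.
 * Returns the number of bytes written, or 0 if dst is too small (an empty
 * source string also yields 0).
 */
size_t ascii_to_ucs2le( const char *src, unsigned char *dst, size_t dstsize )
  {
  size_t len = strlen( src );
  size_t i;

  if( (2 * len) > dstsize )
    return( 0 );
  for( i = 0; i < len; i++ )
    {
    dst[2 * i]     = (unsigned char)src[i];   /* low-order byte first */
    dst[2 * i + 1] = 0x00;                    /* high-order byte      */
    }
  return( 2 * len );
  }
```

The return value is exactly the KeySize that gets handed to MD4(): two bytes per character, terminator excluded.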

The NTLM Hash is of interest because the SMB/CIFS designers at Microsoft (if
indeed such people truly exist any more, except in legend) used it to cleverly
avoid upgrade problems. With LM and NTLM, the hash is created from the password.
Under NTLMv2, however, the older NTLM (v1) Hash is used instead of the password
to generate the new hash. A server or Domain Controller being upgraded to use
NTLMv2 may already have the older NTLM hash values in its authentication
database. The stored values can be used to generate the new hashes--no password
required. That avoids the nasty chicken-and-egg problem of trying to upgrade to
NTLMv2 Hashes on a system that only allows NTLMv2 authentication.

2.8.5.2 THE NTLMV2 PASSWORD HASH

The NTLMv2 Hash is created from:

 * the NTLM Hash (which, of course, is derived from the password),
 * the user's username, and
 * the name of the logon destination.

The process works as shown in the following pseudo-code example:

    v1hash  = NTLMhash( password );
    UniUser = UCS2LE( upcase( user ) );
    UniDest = UCS2LE( upcase( destination ) );
    data    = uni_strcat( UniUser, UniDest );
    datalen = 2 * (strlen( user ) + strlen( destination ));
    v2hash  = HMAC_MD5( v1hash, 16, data, datalen );

Let's clarify that, shall we?



    v1hash   =  The NTLM Hash, calculated as described previously.
    UniUser  =  The username, converted to upper-case UCS-2LE Unicode.
    UniDest  =  The NetBIOS name of either the SMB server or NT Domain against
                which the user is trying to authenticate.
    data     =  The two Unicode strings are concatenated and passed as the Data
                parameter to the HMAC_MD5() function.
    datalen  =  The length of the concatenated Unicode strings, excluding the
                nul termination. Once again, doubling the ASCII string lengths
                is probably cheating.
    v2hash   =  The NTLM Version 2 Hash.



A bit more explanation is required regarding the destination value (which gets
converted to UniDest).

In theory, the client can use NTLMv2 challenge/response to log into a
stand-alone server or to log into an NT Domain. In the former case, the server
will have an authentication database of its very own, but an NT Domain logon
requires authentication against the central database maintained by the Domain
Controllers.

So, in theory, the destination name could be either the NetBIOS name of the
stand-alone server or the NetBIOS name of the NT Domain (no NetBIOS suffix byte
in either case). In practice, however, the server logon doesn't seem to work
reliably. The Windows systems used in testing were unable to use NTLMv2
authentication with one another when they were in stand-alone mode, but once
they joined the NT Domain, NTLMv2 logons worked just fine[49].
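The data and datalen values from the NTLMv2 Hash recipe can be assembled without touching HMAC-MD5 at all. The sketch below assumes 7-bit ASCII names (the same "cheat" used in the pseudo-code); both function names are invented for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append the upper-cased UCS-2LE form of an ASCII string to dst.
 * Returns the number of bytes appended (no terminator).
 */
static size_t put_upper_ucs2le( unsigned char *dst, const char *src )
  {
  size_t i;

  for( i = 0; src[i] != '\0'; i++ )
    {
    dst[2 * i]     = (unsigned char)toupper( (unsigned char)src[i] );
    dst[2 * i + 1] = 0x00;
    }
  return( 2 * i );
  }

/* Build the Data input for the NTLMv2 Hash: upper-case username followed
 * by upper-case destination name, both in UCS-2LE, no terminators.
 * Returns datalen, or 0 if the output buffer is too small.
 */
size_t ntlmv2_hash_data( const char    *user,
                         const char    *destination,
                         unsigned char *data,
                         size_t         datasize )
  {
  size_t need = 2 * (strlen( user ) + strlen( destination ));
  size_t pos;

  if( need > datasize )
    return( 0 );
  pos  = put_upper_ucs2le( data, user );
  pos += put_upper_ucs2le( data + pos, destination );
  return( pos );
  }
```

For the user "fred" logging on to "dom", the result is the 14 bytes of "FREDDOM" in UCS-2LE. That buffer and its length are what get passed, along with the 16-byte v1hash as the key, to HMAC_MD5().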

2.8.5.3 THE NTLMV2 RESPONSE

The NTLMv2 Response is calculated using the NTLMv2 Hash as the Key. The Data
parameter is composed of the challenge plus a blob of data which we will refer
to as "the blob". The blob will be explained shortly. For now, just think of it
as a mostly-random bunch of garblement. The formula is shown in this pseudo-code
example:

    blob = RandomBytes( blobsize );
    data = concat( ServerChallenge, 8, blob, blobsize );
    hmac = HMAC_MD5( v2hash, 16, data, (8 + blobsize) );
    v2resp = concat( hmac, 16, blob, blobsize );

Okay, let's take a closer look at that and see if we can force it to make some
sense.



 1. The first step is blob generation. The blob is normally around 64 bytes in
    size, give or take a few bytes. The pseudo-code above suggests that the
    bytes are entirely random, but in practice there is a formula (explained
    below) for creating the blob.



 2. The next step is to append the blob to the end of the challenge. This, of
    course, is the same challenge sent by the server and used by all of the
    other challenge/response mechanisms.

        offset:   0 .. 7       8 .. (8 + blobsize - 1)
        content:  challenge    blob...

 3. The challenge and blob are HMAC'd using the NTLMv2 Hash as the key.

 4. The NTLMv2 Response is created by appending the blob to the tail of the
    HMAC_MD5() result. That's 16 bytes of HMAC followed by blobsize bytes of
    blob.

        offset:   0 .. 15      16 .. (16 + blobsize - 1)
        content:  hmac         blob...

If the client sends the NTLMv2 Response, it will take the place of the NTLM
Response in the SESSION_SETUP_ANDX.CaseSensitivePassword field. Note that,
unlike the older NTLM Response, the NTLMv2 Response algorithm uses 128-bit
encryption all the way through.
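The concat() helper used in the NTLMv2 Response pseudo-code is not defined anywhere in this section. One plausible shape for it, offered only as a sketch that matches the four-argument call style used there, is:

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated buffer holding the alen bytes at a followed by
 * the blen bytes at b.  The caller must free() the result.
 * Returns NULL if memory cannot be allocated.
 */
unsigned char *concat( const unsigned char *a, size_t alen,
                       const unsigned char *b, size_t blen )
  {
  size_t         total = alen + blen;
  unsigned char *out   = (unsigned char *)malloc( total ? total : 1 );

  if( NULL == out )
    return( NULL );
  memcpy( out,        a, alen );
  memcpy( out + alen, b, blen );
  return( out );
  }
```

With this helper, the first call builds the (8 + blobsize)-byte input to HMAC_MD5(), and the second builds the (16 + blobsize)-byte response that goes on the wire.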

2.8.5.4 CREATING THE BLOB

If you have ever taken a college-level Invertebrate Zoology course, you may find
the dissection of the blob to be nauseatingly familiar. The rest of you... try
not to be squeamish. One more warning before we cut into this: The blob's
structure may not matter at all. We'll explain why a little later on.

Okay, now that the disclaimers are out of the way, we can get back to work. The
blob does have a structure, which is more or less as follows:



4 bytes:  The value seen in testing is consistently 0x01010000. (Note that
          those are nibbles, not bits.) The field is broken out as follows:

              1 byte:   Response type identification number. The only known
                        value is 0x01.
              1 byte:   The identification number of the maximum response type
                        that the client understands. Again, the only known
                        value is 0x01.
              2 bytes:  Reserved. Must be zero (0x0000).

4 bytes:  The value seen in testing is always 0x00000000. This field may,
          however, be reserved for some purpose.
8 bytes:  A timestamp, in the same 64-bit format described back in section
          2.6.3.1.
8 bytes:  The "blip": An eight-byte random value, sometimes referred to as the
          "Client Challenge". More on this later, when we talk about LMv2
          challenge/response.
4 bytes:  Unknown. Comments in the Samba-TNG code and other sources suggest
          that this is meant to be either a 4-byte field or a pair of 2-byte
          fields. These fields should contain offsets to other data. That
          interpretation is probably based on empirical observation, but in
          the testing done for this book there was no pattern to the data in
          these fields. It may be that some implementations provide offsets
          and others just fill this space with left-over buffer garbage.
          Variety is the spice of life.
variable length:  A list of structures containing NetBIOS names in Unicode.
4 bytes:  Unknown. (Appears to be more buffer garbage.)
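The fixed-size front end of the blob (everything before the name list) comes to 28 bytes, and can be laid out as in the sketch below. Several things here are assumptions rather than established fact: the timestamp is written low-order byte first, the "unknown" four bytes are simply zeroed, and the names blob_put_fixed() and BLOB_FIXED_SIZE are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define BLOB_FIXED_SIZE 28   /* 4 + 4 + 8 + 8 + 4 bytes */

/* Write the fixed-length leading fields of the blob into dst, which must
 * have room for at least BLOB_FIXED_SIZE bytes.  Returns the byte count.
 */
size_t blob_put_fixed( unsigned char       *dst,
                       unsigned long long   timestamp,
                       const unsigned char  blip[8] )
  {
  int i;

  dst[0] = 0x01;                 /* response type                    */
  dst[1] = 0x01;                 /* maximum response type understood */
  dst[2] = 0x00;                 /* reserved                         */
  dst[3] = 0x00;
  memset( dst + 4, 0, 4 );       /* always zero in testing           */

  for( i = 0; i < 8; i++ )       /* assumption: little-endian order  */
    dst[8 + i] = (unsigned char)(timestamp >> (8 * i));

  memcpy( dst + 16, blip, 8 );   /* the "Client Challenge"           */
  memset( dst + 24, 0, 4 );      /* unknown field; zero is a guess   */
  return( BLOB_FIXED_SIZE );
  }
```

The name list and the trailing four unknown bytes would be appended after these 28 bytes.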

The list of names near the end of the blob may contain the NT Domain and/or the
server name. As with the names used to generate the NTLMv2 Hash, these are
NetBIOS names in upper-case UCS-2LE Unicode with no string termination and no
suffix byte. The name list also has a structure:



2 bytes:  Name type.

              0x0000  indicates the end of the list.
              0x0001  indicates that the name is a NetBIOS machine name (e.g.,
                      a server name).
              0x0002  indicates that the name is an NT Domain NetBIOS name.
              0x0003  the server's DNS hostname.
              0x0004  a W2K Domain name (a DNS name).

2 bytes:  The length, in bytes, of the name. If the name type is 0x0000, then
          this field will also be 0x0000.
variable length:  The name, in upper-case UCS-2LE Unicode format.

The blob structure is probably related to (the same as?) data formats used in
the more advanced security systems available under Extended Security[50].

2.8.5.5 IMPROVED SECURITY THROUGH CONFUSION



...they have weapons
of mass confusion
and aren't afraid to use them.
-- iomud on Slashdot   

Now that we have the formula worked out, let's take a closer look at the NTLMv2
challenge/response algorithm and see how much better it is than NTLM.

With the exception of the password itself, all of the inputs to NTLMv2 are known
or knowable from a packet capture. Even the blob can be read off the wire, since
it is sent as part of the response. That means that the problem is still a
not-so-simple case of solving for a single variable: the password.

The NTLMv2 Hash is derived directly from the NTLM (v1) Hash. Since there is no
change to the initial input (the password) the keyspace is exactly the same. The
only change is that the increased complexity of the algorithm means that there
are more encryption hoops through which to jump than in the simpler NTLM
process. It takes more computer time to generate a v2 response, which doesn't
impact a normal login but will slow down dictionary and brute force attacks
against NTLMv2 (though Moore's Law may compensate). Weak passwords (those that
are near the beginning of the password dictionary) are still vulnerable.

Another thing to consider is the blob. If the blob were zero length (empty), the
NTLMv2 Response formula would reduce to:

> v2resp = HMAC_MD5( v2hash, 16, ServerChallenge, 8 );

...which would still be pretty darn secure. So the question is this: Does the
inclusion of the blob improve the NTLMv2 algorithm and, if so, how?

Well, see, it's like this... Instead of being produced by the key and challenge
alone, the NTLMv2 Response involves the hash of a chunk of semi-random data. As
a result, the same challenge will not always generate the same response. That's
good, because it prevents replay attacks...in theory.

In practice, the randomness of the challenge should be enough to prevent replay
attacks. Even if that were not the case, the only way that the blob could help
would be if it, too, were non-repeating and if the server could somehow verify
that the blob was not a repeat. That, quite possibly, is why the timestamp is
included.

The timestamp could be used to let the server know that the blob is "fresh".
That is, that it was created a reasonably short amount of time before it was
received. Fresh packets can't easily be forged because the response is
HMAC-signed using the v2hash as the key (and that's based on the password which
is the very thing the cracker doesn't know). Of course, the timestamp test won't
work unless the client and server clocks are synchronized, which is not always
the case.
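
A freshness test of the kind speculated about above might look something like the following C sketch. It assumes that the blob timestamp is a 64-bit count of 100-nanosecond ticks (the Windows time format) and that the server has picked some tolerance window. The name blob_is_fresh() and the five-minute window are made up for illustration, and nothing here implies that Windows actually performs such a check.

```c
#include <stdint.h>

/* Assumption: timestamps count 100-nanosecond ticks. */
#define TICKS_PER_SECOND 10000000ULL

/* Hypothetical tolerance: five minutes either side of the server clock. */
#define FRESH_WINDOW_SECONDS 300ULL

/* Return 1 if blob_time is within the window around server_now,
 * 0 otherwise.  Both values are expressed in ticks. */
int blob_is_fresh(uint64_t blob_time, uint64_t server_now)
{
    uint64_t window = FRESH_WINDOW_SECONDS * TICKS_PER_SECOND;
    uint64_t delta  = (blob_time > server_now) ? blob_time - server_now
                                               : server_now - blob_time;

    return delta <= window;
}
```

The absolute difference is used so that a client clock running slightly ahead of the server is tolerated as well as one running behind. A wider window is more forgiving of clock drift, but gives a replayed response more time to remain acceptable.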

In all likelihood the contents of the blob are never tested at all. There is
code and commentary in the Samba-TNG source that shows that they have done some
testing, and that their results indicate that a completely random blob of bytes
works just fine. If that's true, then the blob does little to improve the
security of the algorithm except perhaps by adding a few more CPU cycles to the
processing time.



Bottom line:  NTLMv2 challenge/response provides only a minimal improvement over
its predecessor.





...it is a tale
Told by an idiot, full of sound
and fury, signifying nothing.
-- Macbeth, Act V, Scene v,
William Shakespeare   

This isn't the first time that we have put a lot of effort into figuring out
some complex piece of the protocol only to discover that it's almost pointless,
and it probably won't be the last time either.



email

--------------------------------------------------------------------------------

From: Ronald Tschalär To: Chris Hertel Subject: The point of client nonces

In section 2.8.5.5 you talk about the "client challenge" a bit, but miss the
point of it: the client nonce (as it should really more correctly be called) is
there to prevent precomputed dictionary attacks by the server, and has nothing
to do with replay attacks against the server (which, as you correctly state, is
what the server challenge is for).

If there's no client nonce, then a rogue server can pick a fixed server nonce
(server challenge), take a dictionary, and precompute all the responses. Then any
time a client connects to it it sends the fixed challenge, and upon receipt of
the client's response it can do a simple database lookup to find the password
(assuming the password was in the dictionary). However, if the client adds its
own bit of random stuff to the response computation, then this attack (by the
server) is not possible. Hence the client nonce.

Even with client nonces a rogue server can still try to use a dictionary to
figure out your password, but the server has to run the complete dictionary on
each response, instead of being able to precompute and use the results for all
responses.
 



2.8.5.6 INSULT TO INJURY: LMV2

There is yet one more small problem with the NTLMv2 Response, and that problem
is known as pass-through authentication. Simply put, a server can pass the
authentication process through to an NT Domain Controller. The trouble is that
some servers that use pass-through assume that the response string is only 24
bytes long.

You may recall that both the LM and NTLM responses are, in fact, 24 bytes long.
Because of the blob, however, the NTLMv2 response is much longer. If a server
truncates the response to 24 bytes before forwarding it to the NT Domain
Controller almost all of the blob will be lost. Without the blob, the Domain
Controller will have no way to verify the response so authentication will fail.

To compensate, a simpler response--known as the LMv2 response--is also
calculated and returned alongside the NTLMv2 response. The formula is identical
to that of NTLMv2, except that the blob is really small.

    blip = RandomBytes( 8 );
    data = concat( ServerChallenge, 8, blip, 8 );
    hmac = HMAC_MD5( v2hash, 16, data, 16 );
    LMv2resp = concat( hmac, 16, blip, 8 );

The "blip", as we've chosen to call it, is sometimes referred to as the "Client
Challenge". If you go back and look, you'll find that the blip value is also
included in the blob, just after the timestamp. It is fairly easy to spot in
packet captures. The blip is 8 bytes long so that the resulting LMv2 Response
will be 24 bytes, exactly the number needed for pass-through authentication.
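
The byte layout of the LMv2 Response can be sketched in C as shown below. This is only an illustration of how the pieces fit together. The HMAC-MD5 routine is passed in by the caller, and the names hmac_md5_fn, lmv2_response(), and fake_hmac_md5() are hypothetical. The fake function is a stand-in for exercising the layout and is not a real HMAC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature for a caller-supplied HMAC-MD5 routine
 * (for example, a thin wrapper around a crypto library). */
typedef void (*hmac_md5_fn)(const uint8_t *key, size_t key_len,
                            const uint8_t *data, size_t data_len,
                            uint8_t mac_out[16]);

/* Build the 24-byte LMv2 Response:
 *   data     = ServerChallenge (8 bytes) followed by blip (8 bytes)
 *   hmac     = HMAC_MD5( v2hash, data )
 *   LMv2resp = hmac (16 bytes) followed by blip (8 bytes)
 */
void lmv2_response(hmac_md5_fn hmac_md5,
                   const uint8_t v2hash[16],
                   const uint8_t server_challenge[8],
                   const uint8_t blip[8],
                   uint8_t resp_out[24])
{
    uint8_t data[16];

    memcpy(data,     server_challenge, 8);
    memcpy(data + 8, blip,             8);

    hmac_md5(v2hash, 16, data, sizeof data, resp_out); /* bytes 0..15  */
    memcpy(resp_out + 16, blip, 8);                    /* bytes 16..23 */
}

/* Stand-in for testing the layout only: NOT a real HMAC.  It XORs the
 * key with the data.  Both lengths must be non-zero. */
void fake_hmac_md5(const uint8_t *key, size_t key_len,
                   const uint8_t *data, size_t data_len,
                   uint8_t mac_out[16])
{
    for (size_t i = 0; i < 16; i++)
        mac_out[i] = (uint8_t)(key[i % key_len] ^ data[i % data_len]);
}
```

Calling lmv2_response() with a real HMAC-MD5 wrapper in place of fake_hmac_md5() would fill resp_out with the 24 bytes that go into the CaseInsensitivePassword field. The last eight bytes are always the blip itself, which is why it is so easy to spot in a capture.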

If it is true that the contents of the blob are not checked, then the LMv2
Response isn't really any less secure than the NTLMv2 Response--even though the
latter is bigger.

The LMv2 Response takes the place of the LM Response in the
SESSION_SETUP_ANDX.CaseInsensitivePassword field.

2.8.5.7 CHOOSING NTLMV2

The use of NTLMv2 is not negotiated between the client and server. There is
nothing in the protocol to determine which challenge/response algorithms should
be used.

So, um... how does the client know what to send, and how does the server know
what to expect?

The default behavior for Windows clients is to send the LM and NTLM responses,
and the default for Windows servers is to accept them. Changing these defaults
requires fiddling in the Windows registry. Fortunately, the fiddles are well
known and documented so we can go through them quickly and get them out of the
way[51].

The registry path to look at is:



> HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\LSA

On Win9x the variable is called LMCompatibility, but on WindowsNT and 2000 it is
LMCompatibilityLevel. That variable may not be present in the registry, so you
might have to add it. In general, it's best to follow Microsoft's instructions
when editing the registry[52].

The settings for LMCompatibilityLevel are as follows:



Level 0: The Default

 * Client Implications: LM and NTLM responses are sent by the client.
 * Server Implications: The server or Domain Controller will compare the
   client's responses against the LM, NTLM, LMv2, and NTLMv2 responses. Any
   valid response is acceptable.

Level 1: NTLMv2 Session Security

 * This level does nothing to change the algorithm used to generate the
   response. Instead, at this level and higher a feature called NTLMv2 Session
   Security is supported. Session Security is only used with Extended Security,
   and must be negotiated between the client and the server. Session Security
   is an advanced topic, and won't be covered here.

Level 2: NTLM Authentication

 * Client Implications: The LM Response is not sent by the client. Instead, the
   NTLM Response is sent in both password fields. Replacing the LM Response
   with the NTLM Response facilitates pass-through authentication. Servers need
   only hand the 24-byte contents of the
   SESSION_SETUP_ANDX.CaseInsensitivePassword field along to the Domain
   Controller.
 * Server Implications: The server or Domain Controller will accept a valid LM,
   NTLM, LMv2, or NTLMv2 response.

Level 3: NTLMv2 Authentication

 * Client Implications: The client sends the LMv2 and NTLMv2 responses in place
   of the older LM and NTLM values.
 * Server Implications: The server or Domain Controller will accept a valid LM,
   NTLM, LMv2, or NTLMv2 response.

Level 4: NTLM Required

 * Client Implications: The client sends the LMv2 and NTLMv2 responses.
 * Server Implications: At this level, the server (or Domain Controller) will
   not check LM Responses. It will compare responses using the NTLM, LMv2,
   and/or NTLMv2 algorithms.

Level 5: NTLMv2 Required

 * Client Implications: The client sends the LMv2 and NTLMv2 responses.
 * Server Implications: The server (or Domain Controller) will compare the
   client's responses using the LMv2 and NTLMv2 algorithms only.

That's just a quick overview of the settings and their meanings. The important
points are these:



 * The password hash type is not negotiated on the wire, but determined by
   client and/or server configuration. If the client and server configurations
   are incompatible, authentication will fail.
 * The SMB server or Domain Controller may try several comparisons in order to
   determine whether or not a given response is valid.
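
To make the compatibility question concrete, here is a small C sketch that encodes the LMCompatibilityLevel settings described above as a lookup table and checks whether a client at one level sends anything that a server at another level will accept. It models only the descriptions in this section and assumes the password is correct. The names (lm_compat_table, lm_compat_can_authenticate(), the RESP_ flags) are invented for the example, and real Windows behavior may have subtleties that the table does not capture.

```c
/* Response types, as bit flags. */
enum {
    RESP_LM     = 1 << 0,
    RESP_NTLM   = 1 << 1,
    RESP_LMV2   = 1 << 2,
    RESP_NTLMV2 = 1 << 3
};

#define RESP_ALL (RESP_LM | RESP_NTLM | RESP_LMV2 | RESP_NTLMV2)

struct lm_compat {
    unsigned client_sends;   /* responses the client transmits  */
    unsigned server_accepts; /* responses the server will check */
};

/* Indexed by LMCompatibilityLevel, 0 through 5. */
static const struct lm_compat lm_compat_table[] = {
    /* 0 */ { RESP_LM | RESP_NTLM,     RESP_ALL },
    /* 1 */ { RESP_LM | RESP_NTLM,     RESP_ALL },  /* same algorithms as 0 */
    /* 2 */ { RESP_NTLM,               RESP_ALL },
    /* 3 */ { RESP_LMV2 | RESP_NTLMV2, RESP_ALL },
    /* 4 */ { RESP_LMV2 | RESP_NTLMV2, RESP_NTLM | RESP_LMV2 | RESP_NTLMV2 },
    /* 5 */ { RESP_LMV2 | RESP_NTLMV2, RESP_LMV2 | RESP_NTLMV2 }
};

/* Return 1 if the client sends at least one response type that the
 * server will check, 0 if not, and -1 if either level is out of range. */
int lm_compat_can_authenticate(int client_level, int server_level)
{
    const int levels =
        (int)(sizeof lm_compat_table / sizeof lm_compat_table[0]);

    if (client_level < 0 || client_level >= levels ||
        server_level < 0 || server_level >= levels)
        return -1;

    return (lm_compat_table[client_level].client_sends &
            lm_compat_table[server_level].server_accepts) != 0;
}
```

For example, lm_compat_can_authenticate(0, 5) returns 0, since a default client sends only LM and NTLM responses and a level 5 server checks neither.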


2.8.6 EXTENDED SECURITY: THAT LIGHT AT THE END OF THE TUNNEL

Our discussion of SMB authentication mechanisms is winding down now. There are a
few more topics to be covered and a few others that will be carefully, but
purposefully, avoided. Extended Security falls somewhere in between. We will dip
our toes into its troubled waters, but we won't wade in too deep (or the
monsters might get us).

One reason for trepidation is that--as of this writing--Extended Security is
still an area of active research and development for the Samba Team and others.
Though much has been learned, and much has been implemented, the dark pools are
still being explored and the fine points are still being examined. Another
deterrent is that Extended Security represents a full set of sub-protocols--a
whole, vast world of possibilities to be explored ...some other day. As with
MS-RPC (which we touched on just long enough to get our fingers burned), the
topic is simply too large to cover here.

As suggested in figure 2.14, Extended Security makes use of nested protocols. Go
back to section 2.6.3.2 and take a look at the
NEGOTIATE_PROTOCOL_RESPONSE.SMB_DATA structure. Note that the
ext_sec.SecurityBlob field is nothing more than a block of bytes--and it's
what's inside that block that matters. If the client and server agree to use
Extended Security, then the whole NEGOTIATE PROTOCOL RESPONSE / SESSION SETUP
REQUEST business becomes a transport for the authentication protocol.

[Figure 2.14]

In some cases the security exchange may require several packets and a few round
trips to complete. When that happens, a single NEGOTIATE PROTOCOL RESPONSE /
SESSION SETUP REQUEST pair will not be sufficient to handle it all. The solution
to this dilemma is fairly simple: The server sends an error message to force the
client to send another SESSION SETUP REQUEST containing the next chunk of data.
 

The only spec I trust
is written in C.
-- Andrew Bartlett,
Samba Team   

The process is briefly (and incompletely) described in section 4.1.2 of the SNIA
doc as part of the discussion of the SESSION SETUP RESPONSE. Simply put, as long
as there are more Extended Security packets required, the server will reply to
the SESSION SETUP REQUEST by sending a NEGATIVE SESSION SETUP RESPONSE with an
NT_STATUS value of 0xC0000016 (which is known as
STATUS_MORE_PROCESSING_REQUIRED). The client then sends another SESSION SETUP
REQUEST containing the additional data. This continues until the authentication
protocol has completed.
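
The client side of that exchange boils down to a loop. The C sketch below shows the control flow only. The session_setup_fn callback (which would send one SESSION SETUP REQUEST and return the NT_STATUS from the reply), the max_rounds safety limit, and the toy_server() stub are all hypothetical. Only the status code value comes from the text above.

```c
#include <stddef.h>
#include <stdint.h>

#define STATUS_SUCCESS                  0x00000000u
#define STATUS_MORE_PROCESSING_REQUIRED 0xC0000016u

/* Hypothetical callback: sends one SESSION SETUP REQUEST carrying the
 * next chunk of security data and returns the NT_STATUS of the reply. */
typedef uint32_t (*session_setup_fn)(void *ctx, int round);

/* Keep sending SESSION SETUP REQUESTs for as long as the server answers
 * STATUS_MORE_PROCESSING_REQUIRED.  Returns the final NT_STATUS, or the
 * last status seen if max_rounds is exhausted first. */
uint32_t extended_security_logon(session_setup_fn send_setup, void *ctx,
                                 int max_rounds, int *rounds_used)
{
    uint32_t status = STATUS_MORE_PROCESSING_REQUIRED;
    int round = 0;

    while (round < max_rounds &&
           status == STATUS_MORE_PROCESSING_REQUIRED) {
        status = send_setup(ctx, round);
        round++;
    }

    if (rounds_used != NULL)
        *rounds_used = round;
    return status;
}

/* Toy server for illustration: ctx points to an int holding the total
 * number of rounds the "authentication protocol" needs. */
uint32_t toy_server(void *ctx, int round)
{
    int needed = *(int *)ctx;

    return (round + 1 < needed) ? STATUS_MORE_PROCESSING_REQUIRED
                                : STATUS_SUCCESS;
}
```

The max_rounds limit is a design choice of the sketch, not part of the protocol. It simply keeps a misbehaving server from holding the client in the loop forever.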

There is no DOS error code equivalent for STATUS_MORE_PROCESSING_REQUIRED,
something we have already whined about in the "!Strange Behavior Alert" box back
in section 2.5.1. It seems that Extended Security expects that the client can
handle NT_STATUS codes, which may be a significant issue for anyone trying to
implement an SMB client[53].

2.8.6.1 THE EXTENDED SECURITY AUTHENTICATION TOOLKIT

There are several different authentication protocols which may be carried within
the SecurityBlob. Those protocols, in turn, are built on top of a whole pile of
different languages and APIs and data transfer formats. The result is an
alphabet soup of acronyms. Here's a taste:
 


 
 
  ...mit Schlag.   



ASN.1:  Abstract Syntax Notation One

ASN.1 is a language used to define the structure and content of objects such as
data records and protocol messages. If you are not familiar with ASN.1, you
might think of it as a super-duper-hyper version of the typedef in C--only a lot
more powerful. ASN.1 was developed as part of the Open Systems Interconnection
(OSI) environment, and was originally used for writing specifications. More
recently, though, tools have been developed that will generate software from
ASN.1.

The development and promotion of the ASN.1 language is managed by the ASN.1
Consortium.
 

DER:  Distinguished Encoding Rules of ASN.1

DER is a set of rules for encoding and decoding ASN.1 data. It provides a
standard format for transport of data over a network so that the receiving end
can convert the data back into its correct ASN.1 format. DER is a specialized
form of a more general encoding known as BER (Basic Encoding Rules). DER is
designed to work well with security protocols, and is used for encoding Kerberos
exchanges. (LDAP messages use a restricted form of BER rather than DER.)
 

GSS-API:  Generic Security Service Application Program Interface

As the name suggests, GSS-API is a generic interface to a set of security
services. It makes it possible to write software that does not care what the
underlying security mechanisms actually are. GSS-API is documented in RFC 2078.
 

Kerberos:  "Kerberos" is a name, not an acronym.

Kerberos is the preferred authentication system for SMB over naked TCP/IP
transport. The use of Kerberos with CIFS is also tied in with GSS-API and
SPNEGO.
 

LDAP:  Lightweight Directory Access Protocol

Some folks at the University of Michigan realized that the DAP protocol (which
was designed as part of the Open Systems Interconnection (OSI) environment
for use with the X.500 directory system) was just too big and hairy for
general-purpose use, so they came up with a "lightweight" version, known as
LDAP. LDAP was popularized in the mid-1990's, and support was included with
directory service implementations such as Novell's NDS (Novell Directory
Service). When Microsoft created their Active Directory system they followed
Novell's lead and added LDAP support as well.
 

MIDL:  Microsoft Interface Definition Language

MIDL is Microsoft's version of the Interface Definition Language (IDL). It is
used to specify the parameters to function calls, particularly function calls
made across a network--things like Remote Procedure Call (RPC). MIDL is also
used to define the interfaces to Microsoft DLL library functions.
 

MS-RPC:  Microsoft Remote Procedure Call

Paul Leach was one of the founders of Apollo Computer. At Apollo, he worked on a
system for distributed computing that eventually became the DCE/RPC system. When
Hewlett-Packard purchased Apollo, Leach went to Microsoft. That's probably why
the MS-RPC system is so remarkably similar to the DCE/RPC system.
 

NDR:  Network Data Representation

NDR is to DCE/RPC as DER (or BER) is to ASN.1. That is, NDR is an on-the-wire
encoding for parameters passed via RPC. When an MS-RPC call is made on the
client side, the parameters are converted into NDR format (marshalled) for
transmission over the network. On the server side, the NDR formatted data is
unmarshalled and passed into the called function. The process is then reversed
to return the results.
 

NTLMSSP:  NTLM Security Support Provider

It seems there are a few variations on the interpretation of the acronym. A
quick web search for "NTLMSSP" turns up NTLM Security Service Provider and NTLM
Secure Service Provider in addition to NTLM Security Support Provider. No
matter. They all amount to the same thing.

NTLMSSP is a Windows authentication service that is accessed in much the same
way as MS-RPC services. NTLMSSP authentication requests are formatted into a
record structure and converted to NDR format for transport to the NTLMSSP
authentication service provider. In addition to Extended Security, NTLMSSP
authentication shows up in lots of odd places. It is even used by Microsoft
Internet Explorer to authenticate HTTP (web) connections.
 

SPNEGO:  Simple, Protected Negotiation

Also known as "The Simple and Protected GSS-API Negotiation Mechanism", SPNEGO
is a pseudo-mechanism that plugs into GSS-API. It is used to negotiate the
actual security mechanism to be used between two systems. SPNEGO is documented
in RFC 2478.



Quite a list, eh?
 

I am friend to the undertow
I take you in, I don't let go
-- Undertow,
Suzanne Vega   

As you can see, there is a lot going on below the surface of Extended Security.
We could try diving into a few of the above topics, but the waters are deep and
the currents are strong and we would quickly be swept away. Out of necessity, we
will spend a little time talking about Kerberos, but we won't swim out too far
and we will be wearing a PFD (Personal Flotation Device--don'cha just love
acronyms?).


2.8.7 KERBEROS

As already stated, we won't be going into depth about Kerberos. There is a lot
of documentation available on the Internet and in print, so the wiser course is
to suggest some starting points for research. There are, of course, several
starting points presented in the References section of this very book. A good
place to get your feet wet is Bruce Schneier's Applied Cryptography, Second
Edition.

Kerberos version 5 is specified in RFC 1510, but this is CIFS we're talking
about. Microsoft has made a few "enhancements" to the standard. The best known
is probably the inclusion of a proprietary Privilege Attribute Certificate (PAC)
which carries Windows-specific authorization information. Microsoft heard a lot
of grumbling about the PAC, and in the end they did publish the information
required by third-party implementors. They even did so under acceptable
licensing terms (and the CIFS community sighed a collective sigh of relief). The
PAC information is available in a Microsoft Developer Network (MSDN) document
entitled Windows 2000 Authorization Data in Kerberos Tickets.

There are a lot of Kerberos-related RFCs. The interesting ones for our purposes
are:



 * RFC 1964, which provides information about the use of Kerberos with GSS-API
 * RFC 3244, which covers Microsoft's Kerberos password-set and password-change
   protocols

There is also (as of this writing) a set of Internet Drafts that cover Microsoft
Kerberos features, including a draft for Kerberos authentication over HTTP.

Finally, a web search for "Microsoft" and "Kerberos" will toss up an abundant
salad of opinions and references, both historical and contemporary. Where CIFS
is concerned, it seems that there is always either too little or too much
information. Microsoft-compatible Kerberos falls under the latter curse. There
is a lot of stuff out there, and it is easy to get overwhelmed. If you plan to
dive in, find a buddy. Don't swim alone.


2.8.8 RANDOM NOTES ON W2K AND NT DOMAIN AUTHENTICATION

We have been delicately dancing around the role of the Domain Controller in
authentication. It's time to face the music.

The concept is fairly simple: Take the password database that is normally kept
locally by a stand-alone server and move it to a central authority so that it
can be shared by multiple servers, then call the whole thing a "Domain". The
central authority that stores the shared database is, of course, the Domain
Controller. As shown in figure 2.15, the result is that the SMB fileserver must
now consult the Domain Controller when a user tries to access SMB services.

[Figure 2.15]

That general description applies to both NT and W2K Domains, even though the two
are implemented in very different ways. Windows2000 Domains are based on Active
Directory and Kerberos, while WindowsNT Domains make use of a Security Accounts
Manager (SAM) Database and MS-RPC.

Let's see what bits of wisdom we can pull out of the hat regarding these two
Domain systems...

2.8.8.1 A QUICK LOOK AT W2K DOMAINS

As with Microsoft's Kerberos implementation, there is probably too much
information available on this topic. A full description would also be very much
beyond the stated scope of this book. So, as briefly as possible, here are some
notes about W2K Domains and Domain Controllers:



 * Windows2000 Domains are based on the real thing: The Internet Domain Name
   Service (DNS). The DNS provides a hierarchical namespace, and W2K can take
   advantage of the DNS hierarchy to form collections of related W2K domains
   called "trees". Groups of separate W2K Domain trees are known as "forests".



 * W2K Domain Controllers run the Active Directory service. Active Directory
   (AD) is a database system that can be used to store all sorts of information,
   including user account data. The design of AD owes a lot to Novell's NDS
   architecture which, in turn, is based on OSI X.500.



 * Data stored in the Active Directory may be accessed using the LDAP protocol.



 * Microsoft's Kerberos implementation relies upon the data stored in the Active
   Directory. The two services are closely linked.

...and that barely begins to scratch the surface. CIFS client and server
participation in a W2K Domain requires Kerberos support, but does not require a
detailed understanding of Active Directory architecture. The points above are of
interest here primarily for comparison with the NT Domain system notes,
presented below.

2.8.8.2 A FEW NOTES ABOUT NT DOMAINS

In contrast to W2K Domains, NT Domains have the following features:



 * WindowsNT Domains are built upon NetBIOS. The NetBIOS namespace is flat, not
   hierarchical, so there is no natural way to build relationships among NT
   Domains. Conceptually, NT Domains are stand-alone.



 * NT authentication information is stored in the Security Accounts Manager
   (SAM) Database. The SAM is an extension of the WindowsNT Registry database,
   and it is accessed using a Windows DLL.



 * In an NT Domain, the shared SAM database is stored on the Domain Controller
   and may be accessed using RPC function calls. (Windows2000, of course, stores
   the SAM data in the Active Directory, but it can also respond to the RPC
   calls for compatibility with the NT Domain system.)

There are two mechanisms that an SMB Server can use to ask a Domain Controller
to validate a client logon attempt. These are known as pass-through and NetLogon
authentication. The NetLogon mechanism uses MS-RPC, so we won't cover it here
except to say that it provides a more intimate relationship between the SMB
server and the Domain Controller than does the pass-through mechanism. There are
several good sources for further reading listed in the References section. In
particular:



 * Start with the whitepaper More Than You Ever Wanted to Know about NT Login
   Authentication, by Philip Cox and Paul Hill. It provides a clear and succinct
   introduction to the WindowsNT authentication system.



 * Another good overview from a different perspective is provided in the
   whitepaper CIFS Authentication and Security by Bridget Allison (now Bridget
   Warwick).



 * ...and once you've read that you'll be ready for the more in-depth NetLogon
   coverage in Luke Leighton's book.

Pass-through, in contrast to NetLogon, is really quite simple. It is also
documented in (yet) an(other) expired Leach/Naik IETF draft, titled CIFS Domain
Logon and Pass Through Authentication, which can be found on Microsoft's CIFS
FTP site (under the name cifslog.txt).

Basically, pass-through authentication is a man-in-the-middle mechanism. It goes
like this:



 * The client attempts to log on to the server, but the server has no SAM
   database so it, in turn, attempts to create an SMB session with the Domain
   Controller.



 * The server sends a NEGOTIATE PROTOCOL REQUEST to the DC. The DC returns a
   challenge which the server passes back to the client.



 * The client does the hard work and generates the various responses (LM, NTLM,
   etc.), which are sent to the server. The server simply passes them through to
   the DC in its own SESSION SETUP REQUEST.



 * If the DC returns a POSITIVE SESSION SETUP RESPONSE to the server, then the
   server will return a POSITIVE SESSION SETUP RESPONSE to the client. Likewise
   with a negative response.
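
The four steps above can be sketched in C as a simple relay. Everything in the example is hypothetical: the dc_link structure stands in for the server's SMB session with the DC, client_fn stands in for the client, and the toy challenge/response (an XOR with a one-byte secret) is only there so that the relay has something to carry. The point to notice is that passthrough_logon() never sees the secret.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the Domain Controller, as seen by the server. */
struct dc_link {
    void (*negotiate)(void *ctx, uint8_t challenge[8]);
    int  (*session_setup)(void *ctx, const uint8_t *resp, size_t len);
    void *ctx;
};

/* Hypothetical client: given a challenge, writes a response into resp
 * and returns its length. */
typedef size_t (*client_fn)(void *ctx, const uint8_t challenge[8],
                            uint8_t *resp, size_t cap);

/* The server in the middle.  Returns 1 for a positive response from
 * the DC and 0 for a negative one. */
int passthrough_logon(const struct dc_link *dc,
                      client_fn client, void *client_ctx)
{
    uint8_t challenge[8];
    uint8_t resp[64];
    size_t  len;

    dc->negotiate(dc->ctx, challenge);              /* challenge from the DC  */
    len = client(client_ctx, challenge,             /* client does the work   */
                 resp, sizeof resp);
    return dc->session_setup(dc->ctx, resp, len);   /* DC's verdict, relayed  */
}

/* --- Toy DC and client, for illustration only --- */
struct toy_dc { uint8_t secret; uint8_t challenge[8]; };

void toy_negotiate(void *ctx, uint8_t challenge[8])
{
    struct toy_dc *dc = ctx;
    memcpy(challenge, dc->challenge, 8);
}

int toy_session_setup(void *ctx, const uint8_t *resp, size_t len)
{
    struct toy_dc *dc = ctx;
    if (len != 8)
        return 0;
    for (size_t i = 0; i < 8; i++)
        if (resp[i] != (uint8_t)(dc->challenge[i] ^ dc->secret))
            return 0;
    return 1;
}

size_t toy_client(void *ctx, const uint8_t challenge[8],
                  uint8_t *resp, size_t cap)
{
    uint8_t secret = *(uint8_t *)ctx;   /* what the client "knows" */
    if (cap < 8)
        return 0;
    for (size_t i = 0; i < 8; i++)
        resp[i] = (uint8_t)(challenge[i] ^ secret);
    return 8;
}
```

A client that holds the same secret as the toy DC gets a positive result through the relay, and any other client gets a negative one.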

It should be easy to capture an example of pass-through authentication using
your network sniffer. Windows9x systems (and possibly other Windows varieties)
do not support NetLogon so they always use the pass-through method if they are
part of an NT Domain. Samba can be configured to use either method.



Radical Rodent Alert:


--------------------------------------------------------------------------------

There is an obscure Windows SMB file transfer mode implemented by Windows98,
WindowsMe, and possibly other Windows flavors. This mode is known in the
community as "rabbit-pellet" mode, and it is triggered by various subtle
combinations of conditions. In testing, it appears as though delays in
pass-through authentication may be a factor.

In rabbit-pellet mode the client will send a file to the server in very small
chunks, somewhere between 512 and 1536 bytes each (give or take). The client
will wait for a reply to each write, and will also send a flush request between
every one or two writes. This slows down file transfers considerably.

The condition is rare, which is good because it's really annoying when it
happens. It's also bad because it has been a very difficult problem to track
down.
 



2.8.8.3 IT'S GOOD TO HAVE A BACKUP

In the NT Domain system, there is a single Domain Controller that is primarily
responsible for the maintenance of the domain's SAM database. This Domain
Controller is known as the (surprise) Primary Domain Controller (PDC).

The domain may also have zero or more Backup Domain Controllers (BDCs). The BDCs
keep read-only replicas of the PDC's SAM database. BDCs can be used for
authentication just as the PDC can, and if the PDC is accidentally thrown out of
a twelfth-story window into an active volcano, a BDC can be "promoted" to fill
the role of the dearly departed PDC.

Windows2000 Domains do things differently. They do not distinguish between
Primary and Backup DCs. Instead, Active Directory makes use of something called
"multimaster replication". Updates to any replica are propagated to all of the
other replicas, so there is no longer any need to specify one copy of the
database as the primary.

2.8.8.4 TRUST ME ON THIS

This is one of those concepts that we have to cover because--unless you're
already familiar with it--you'll read about it somewhere else and think to
yourself "What the heck is that all about?".

Somewhere back a few paragraphs it was stated that NT Domains are, conceptually,
stand-alone entities ...and so they are, but it is possible to introduce them to
one another and get them to cooperate. The agreements forged between the domains
are known as "Inter-Domain Trust Relationships".

Let's use an example to explain what this is all about.

Consider a large corporate organization with several divisions, departments,
committees, consultants, and such-like. In this corporation, the Business Units
Reassignment Planning Division runs the BURP_DIV domain, and the Displacement
Entry Department calls theirs the DISENTRY domain.

Now, let's say that the BURP_DIV folks need access to files stored on DISENTRY
servers (so they can move the files around a bit). One way to handle this would
be to create accounts for the BURP_DIV users in the DISENTRY domain. That would
cause a bit of a problem, however, because the BURP_DIV users would need two
accounts, one per domain. That is likely to result in things like passwords,
preferences, and web browser bookmarks getting a bit out of sync. Also, the
Benefits Reduction Committee will want to know why all of the BURP_DIV employees
are moonlighting in the DISENTRY department and how they could possibly be doing
two jobs at once. It could become quite a mess, resulting in the hiring of
dozens of consultants to ensure that the problem is properly ignored.

The better way to handle this situation is to create a trust relationship
between the DISENTRY and BURP_DIV domains. With inter-domain trust established,
the BURP_DIV folks can log on to DISENTRY servers using their BURP_DIV
credentials. As shown in figure 2.16, the DISENTRY Domain Controller will ask
the BURP_DIV Domain Controllers to validate the logon.

[Figure 2.16]

Note that, in the non-extended-security version of the SESSION SETUP REQUEST
message, there is a field called PrimaryDomain. This field identifies the NT
domain against which the client wishes to authenticate. That is, the
PrimaryDomain field should contain the name of the NT Domain to which the user
belongs.

Windows2000 domains also support trust relationships. This is useful for
creating trust between two separate W2K Domain trees, or between W2K Domains and
NT Domains.

The mechanisms used to support inter-domain trust are very advanced topics, and
won't be covered here.


2.8.9 RANDOM NOTES ON MESSAGE AUTHENTICATION CODES

Message Authentication Codes (MACs) are used to prevent "pickle-in-the-middle"
attacks (more commonly known as "man-in-the-middle" attacks[54]). This form of
attack is simple to describe, but it can be difficult to pull off in practice
(though wireless LAN technology has the potential to make it much easier).
Figure 2.17 provides some visuals.

[Figure 2.17]

Generally speaking, in the pickle-in-the-middle attack an evil interloper allows
the "real" client to authenticate with the server and then assumes ownership of
the TCP/IP connection, thus bypassing the whole problem of needing to know the
password.

There are a number of ways to hijack the TCP session, but with SMB that step
isn't necessary. Instead, the evil interloper can simply impersonate the server
to fool the client. For instance, if the evil interloper is on the same IP
subnet as both the client and server (a B-mode network) then it can usurp the
server's name by responding to broadcast name queries sent by the client faster
than the server does. Server identity theft can also be accomplished by
"poisoning" the NBNS database (or, possibly, the DNS). That is, by somehow
forcing it to swallow false information. A simple way to do that is to register
the server's name--with the interloper's IP address--in the NBNS before the
server does (perhaps by registering the name while the server is down for
maintenance or something).
 

Proof by Familiarity:
Well, that looks familiar!
-- Jonathan Young, PhD.   

In any case, when the client tries to open an SMB session with the server it may
wind up talking to the evil interloper instead. The evil interloper will pass
the authentication request through to the real server, then pass the challenge
back to the client, and then pass the client's response to the server... and
that... um... um... um... that looks exactly like pass-through authentication.
In fact, the basic difference between pass-through authentication and this type
of attack is the ownership of the box that is relaying the authentication. If
the crackers control the box, consider it an attack.

This authentication stuff is fun, isn't it?

So, given a situation in which you are concerned about evil interlopers gaining
access to your network, you need a mechanism that allows the client and server
to prove to one another on an ongoing basis that they are the real client and
server. That's what the MACs are supposed to do.



Caveat Emptor Alert:


--------------------------------------------------------------------------------

As of this writing, SMB MAC signing is an active area of research for the Samba
Team.

The available documentation regarding Message Authentication Codes (MACs)
disagrees to some extent with empirical results. The information presented in
this section is the best currently available, derived from both the
documentation and the testing being done by the Samba Team. As such, it is
probably (but not necessarily) very close to reality. Doveryay, no proveryay
(trust, but verify).
 



2.8.9.1 GENERATING THE SESSION KEY

The server and client each generate a special key, known as the Session Key.
There are several potential uses for the Session Key, but we will only be
looking at its use in MAC signing.

The Session Key is derived from the password hash--something that only the
client and server should know. There are several hash types available: LM, NTLM,
LMv2, and NTLMv2. The hash chosen is probably the most advanced hash that the
two systems know they share. So, if the client sent an LM Response--but did not
send an NTLM Response--then the Session Key will be based on the LM Hash. The LM
Session Key is calculated as follows:

    char eightnuls[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
    LM_Session_Key = concat( LM_Hash, 8, eightnuls, 8 );

That is, take the first eight bytes of the LM Hash and add eight nul bytes to
the end for a total of 16 bytes. Note that the resulting Session Key is not the
same as the LM Hash itself. As stated earlier, the password hashes can be used
to perform all of the authentication functions we have covered so far, so they
must be protected as if they were the actual password. Overwriting the last
eight bytes of the hash with zeros serves to obfuscate the hash (though this
method is rather weak).
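
For those who prefer real C to pseudocode, the same calculation might be written as follows. The function name lm_session_key() is made up for the example, and the sketch assumes that the caller already holds the 16-byte LM Hash.

```c
#include <stdint.h>
#include <string.h>

/* Derive the 16-byte LM Session Key from the 16-byte LM Hash. */
void lm_session_key(const uint8_t lm_hash[16], uint8_t key_out[16])
{
    memcpy(key_out, lm_hash, 8);   /* first eight bytes of the LM Hash */
    memset(key_out + 8, 0, 8);     /* followed by eight nul bytes      */
}
```

Writing into a separate output buffer leaves the original hash untouched, which matters because the full hash is still needed elsewhere.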

A different formula is used if the client did send an NTLM Response. The NTLM
Session Key is calculated like so:

    NTLM_Session_Key = MD4( NTLM_Hash );

Which means that the NTLM Session Key is the MD4 of the MD4 of the Unicode
password. The SNIA doc says there's only one MD4, but that would make the NTLM
Session Key the same as the NTLM Hash. Andrew Bartlett of the Samba Team says
there are two MD4s; the second does a fine job of protecting the
password-equivalent NTLM Hash from exposure.

Moving along to LMv2 and NTLMv2, we find that the Session Key recipe is slightly
more complex, but it's all stuff we have seen before. We need the following
ingredients:



v2hash  =  The NTLM Version 2 Hash, which we calculated back in section 2.8.5.2.

hmac    =  The result of the HMAC_MD5() function using the v2hash as the key and
           the server challenge plus the blob (or blip) as the input data. The
           NTLMv2 hmac was calculated in section 2.8.5.3, and sent as the first
           16 bytes of the response. The LMv2 hmac was calculated in section
           2.8.5.6.



The LMv2 and NTLMv2 session keys are computed as follows:

    LMv2_Session_Key   = HMAC_MD5( v2hash, 16,   lmv2_hmac, 16 );
    NTLMv2_Session_Key = HMAC_MD5( v2hash, 16, ntlmv2_hmac, 16 );

The client is able to generate the Session Key because it knows the password and
other required information (the user entered them at the logon prompt). If the
server is stand-alone, it will have the password hash
and other required information in its local SAM database, and can generate the
Session Key as well. On the other hand, if the server relies upon a Domain
Controller for authentication then it won't have the password hash and won't be
able to generate the Session Key.

What's a server to do?

As we have already pointed out, the MAC protocol is designed to prevent a
situation that looks exactly like pass-through authentication, so a pass-through
server simply cannot do MAC signing. A NetLogon-capable server, however, has a
special relationship with the Domain Controller. The NetLogon protocol is
secured, so the Domain Controller can generate the Session Key and send it to
the server. That's how an NT Domain member server gets hold of the Session Key
without ever knowing the user's password or password hash.

2.8.9.2 SEQUENCE NUMBERS

Both the client and server maintain an integer counter which they initialize to
zero. This counter is used as a message sequence number, and it gets incremented
for every message such that requests always have an even sequence number and
replies always have an odd one.

The zero-eth message is always a SESSION SETUP ANDX message, but it may not be
the first SESSION SETUP ANDX of the session. Recall, from near the beginning of
the Authentication section, that the client sometimes uses an anonymous or guest
logon to access server information. Watch enough packet captures and you will
see that MAC signing doesn't really start until after a real user logon occurs.

Also, it appears from testing that the MAC Signature in the zero-eth message is
never checked (and that existing clients send a bogus MAC Signature in the
zero-eth packet). That's okay, since the authenticity of the zero-eth message
can be verified by the fact that it contains a valid response to the server
challenge.

Once the MAC signing has been initialized within a session, all messages are
numbered using the same counters and signed using the same Session Key. This is
true even if additional SESSION SETUP ANDX exchanges occur.
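
Here is a small sketch of the sequence counter described in this section. The structure and function names are hypothetical. The only behavior taken from the text is that the counter starts at zero and is bumped once per message, which makes requests even and replies odd.

```c
#include <stdint.h>

/* Per-session signing counter (hypothetical structure). */
typedef struct {
    uint32_t next_seq;   /* next sequence number to hand out */
} smb_seq_state;

/* The counter starts at zero when MAC signing is initialized. */
void smb_seq_init(smb_seq_state *s)
{
    s->next_seq = 0;
}

/* Return the sequence number for the next message (request or reply)
 * and advance the counter by one.
 */
uint32_t smb_seq_next(smb_seq_state *s)
{
    return s->next_seq++;
}

/* Requests carry even sequence numbers, replies carry odd ones. */
int smb_seq_is_request(uint32_t seq)
{
    return (seq & 1u) == 0;
}
```

A client would call smb_seq_next() once when signing a request and once more when checking the matching reply.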

2.8.9.3 CALCULATING THE MAC

The MAC itself is calculated using the MD5 function. That's the plain MD5, not
HMAC-MD5 and not MD4. The input to the MD5 function consists of three
concatenated blocks of data:



 * the Session Key,
 * the Response, and
 * the SMB message.

We start by combining the Session Key and the response into a single value known
as the MAC Key. For LM, NTLM, and LMv2 the MAC Key is created like so:

    MAC_Key = concat( Session_Key, 16, Response, 24 );

The thing to note here is that all of the responses, with the exception of the
NTLMv2 Response, are 24 bytes long. So, except for NTLMv2, all auth mechanisms
produce a MAC Key that is 40 bytes long. (16 + 24 = 40). Unfortunately, the
formula for creating the NTLMv2 MAC Key is not yet known. It is probably similar
to the above, however. Possibly identical to the calculation of the LMv2 MAC
Key, or possibly the concatenation of the Session Key with the first 28 bytes of
the blob.
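
For the LM, NTLM, and LMv2 cases, the concat() call above might look like this in C. The names make_mac_key() and the length constants are invented for the example. The NTLMv2 case is deliberately left out, since its formula is not known.

```c
#include <stdint.h>
#include <string.h>

#define SESSION_KEY_LEN  16
#define RESPONSE_LEN     24
#define MAC_KEY_LEN      (SESSION_KEY_LEN + RESPONSE_LEN)   /* 40 bytes */

/* MAC Key = Session Key followed by the 24-byte response.
 * Covers LM, NTLM, and LMv2 only; NTLMv2 is not handled here.
 */
void make_mac_key(const uint8_t session_key[SESSION_KEY_LEN],
                  const uint8_t response[RESPONSE_LEN],
                  uint8_t mac_key[MAC_KEY_LEN])
{
    memcpy(mac_key, session_key, SESSION_KEY_LEN);
    memcpy(mac_key + SESSION_KEY_LEN, response, RESPONSE_LEN);
}
```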

Okay, now you need to pay careful attention. The last few steps of MAC Signature
calculation are a bit fiddly.



 1. Start by re-acquainting yourself with the structure of the SMB_HEADER.EXTRA
    field, as described in section 2.4.2.1. We are particularly interested in
    the eight bytes labeled Signature.

 2. The sequence number is written as a longword into the first four bytes of
    the SMB_HEADER.EXTRA.Signature field. The remaining four bytes are zeroed.

 3. The MAC Signature is calculated as follows:
    
        data = concat( MAC_Key, MAC_Key_Len, SMB_Msg, SMB_Msg_Len );
        hash = MD5( data );
        MAC  = head( hash, 8 );
    
    In words: the MAC Signature is the first eight bytes of the MD5 of the
    MAC_Key plus the entire SMB message.

 4. The eight bytes worth of MAC Signature are copied into the
    SMB_HEADER.EXTRA.Signature field, overwriting the sequence number.

...and that, to the best of our knowledge, is how it's done.
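
The header fiddling in steps 2 and 4 can be sketched as a pair of helpers. Several things here are assumptions rather than facts from this section: the Signature field is taken to start at byte offset 14 of the 32-byte SMB header, the sequence number is written in little-endian order like other SMB fields, and the function names are invented. The MD5 itself (step 3) is left to whatever MD5 implementation you have on hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMB_HDR_LEN     32
#define SMB_SIG_OFFSET  14   /* assumed offset of EXTRA.Signature */
#define SMB_SIG_LEN      8

/* Step 2: write the sequence number into the first four bytes of the
 * Signature field (little-endian, by assumption) and zero the rest.
 * Returns 0 on success, -1 if the message is shorter than a header.
 */
int smb_sig_prepare(uint8_t *msg, size_t msg_len, uint32_t seq)
{
    uint8_t *sig;

    if (msg_len < SMB_HDR_LEN)
        return -1;
    sig = msg + SMB_SIG_OFFSET;
    sig[0] = (uint8_t)(seq & 0xFF);
    sig[1] = (uint8_t)((seq >> 8) & 0xFF);
    sig[2] = (uint8_t)((seq >> 16) & 0xFF);
    sig[3] = (uint8_t)((seq >> 24) & 0xFF);
    memset(sig + 4, 0, 4);
    return 0;
}

/* Step 4: copy the first eight bytes of the MD5 digest over the
 * sequence number.  The caller computes the digest (step 3) by feeding
 * the MAC Key followed by the prepared message to an MD5 function.
 */
int smb_sig_finish(uint8_t *msg, size_t msg_len,
                   const uint8_t md5_digest[16])
{
    if (msg_len < SMB_HDR_LEN)
        return -1;
    memcpy(msg + SMB_SIG_OFFSET, md5_digest, SMB_SIG_LEN);
    return 0;
}
```

Splitting the work this way keeps the sketch independent of any particular MD5 library.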

2.8.9.4 ENABLING AND REQUIRING MAC SIGNING

Windows NT systems offer four registry keys to control the use of SMB MAC
signing. The first two manage server behavior, and the second pair represent
client settings.



Server:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanManServer\Parameters

  EnableSecuritySignature
    The valid values are zero (0) and one (1).
    If zero, then MAC signing is disabled on the server side.
    If one, then the server will support MAC signing.

  RequireSecuritySignature
    The valid values are zero (0) and one (1).
    This parameter is ignored unless MAC signing is enabled via the
    EnableSecuritySignature parameter.
    If zero, then MAC signing is optional and will only be used if the client
    also supports it.
    If one, then MAC signing is required. If the client does not support MAC
    signing then authentication will fail.



Client:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Rdr\Parameters

  EnableSecuritySignature
    The valid values are zero (0) and one (1).
    If zero, then MAC signing is disabled on the client side.
    If one, then the client will support MAC signing.

  RequireSecuritySignature
    The valid values are zero (0) and one (1).
    This parameter is ignored unless MAC signing is enabled via the
    EnableSecuritySignature parameter.
    If zero, then MAC signing is optional and will only be used if the server
    also supports it.
    If one, then MAC signing is required. If the server does not support MAC
    signing then authentication will fail.

Study those closely and you may detect some small amount of similarity between
the client and server parameter settings. (Well, okay, they are mirror images of
one another.) Keep in mind that the client and server must have compatible
settings or the SESSION SETUP will fail.
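
One way to read the rules above is as a small decision function. The sketch below is an interpretation of the parameter descriptions, not tested behavior of any Windows system. The type and function names are hypothetical.

```c
/* One side's settings, mirroring the two registry values. */
typedef struct {
    int enable;    /* EnableSecuritySignature:  0 or 1 */
    int require;   /* RequireSecuritySignature: 0 or 1 */
} signing_policy;

typedef enum {
    SIGNING_OFF,    /* session proceeds without MAC signing   */
    SIGNING_ON,     /* both sides support it, so it is used   */
    SIGNING_FAIL    /* incompatible settings; the setup fails */
} signing_result;

signing_result signing_outcome(signing_policy client, signing_policy server)
{
    /* The Require setting is ignored unless Enable is also set. */
    int client_requires = client.enable && client.require;
    int server_requires = server.enable && server.require;

    if ((client_requires && !server.enable) ||
        (server_requires && !client.enable))
        return SIGNING_FAIL;
    if (client.enable && server.enable)
        return SIGNING_ON;
    return SIGNING_OFF;
}
```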

These options are also available under Windows2000, but are managed using
security policy settings [55].


2.8.10 NON SEQUITUR TIME

A mathematician, a physicist, and an engineer were sitting together in a
teashop, sharing a pot of Lapsang Souchong and discussing the relationship
between theory and practice. The mathematician said "One of my students asked me
today whether all odd numbers greater than one were prime numbers, so I provided
this simple proof:



> "Stated: All odd numbers greater than one are prime.
> 3 is prime,
> 5 is prime,
> 7 is prime,
> 9 is divisible by three, so it is odd but not prime.
> Contradiction; the statement is false."

"Interesting.", replied the physicist. "Perhaps I have the same student. I was
asked the same question today. I solved the problem using a thought-experiment,
as Galileo might have done. Our experiment was as follows:



> "By observation we can see that:
> 3 is prime,
> 5 is prime,
> 7 is prime,
> 9 is experimental error,
> 11 is prime,
> 13 is prime..."



Proof by Assertion:
This has got to be true!
-- Jonathan Young, PhD.   

The engineer interrupted before the physicist could draw a conclusion, and said
"Out in the field we don't have time to mess with theory. We just define all odd
numbers as prime and work from there. It's simpler that way."

Consider this as you contemplate what you have learned about SMB authentication.


2.8.11 FURTHER STUDY

You should now have all you need to create an SMB session with an SMB server. As
you become more comfortable with the system you will likely become curious about
the vast uncharted jungle of Extended Security. Don't be afraid to go exploring.
With the background provided here, and the guidebooks listed in the References
section, you are well prepared. If you get it all mapped out, do us all a favor:
write it up so that everyone can share what you've learned.

A few more bits of advice before we move along...



 1. Know what you've got to work with.
    
    This is one of Andrew Bartlett's rules of thumb. If you are trying to figure
    out how an encrypted token or key or somesuch is derived, consider the
    available functions and inputs. Existing tools and values are often re-used.
    Just look through the calculation of the NTLMv2 Response and you'll see what
    we mean.

 2. Trust but verify.
    
    Read the available documentation and make notes, but don't assume that the
    documentation is always right. The truth is on the wire. In some cases
    implementations stray from the specifications, and in other cases (e.g.,
    this book) the documentation is a best-effort attempt at presenting what has
    been learned. There are few truly definitive sources. Another factor, as you
    are by now aware, is that there is a tremendous amount of variation in the
    CIFS world. Something may work correctly in one instance only to surprise
    you in another.

 3. Don't be surprised.
    
    Don't go looking for weirdness in CIFS, but don't be surprised when you find
    it. If you expect bad behavior, you may miss the sane and obvious. A lot of
    CIFS does, in fact, make some sort of sense when you think about it. There
    are gotchas, though, so be prepared.

These guidelines are quite general, but they apply particularly well to the
study of SMB security and authentication.



--------------------------------------------------------------------------------


2.9 BUILDING YOUR SMB VOCABULARY

Looking back over our shoulders, we see that we have performed only two SMB
exchanges so far: the NEGOTIATE PROTOCOL and the SESSION SETUP. There may be a
TREE CONNECT shoved into the packet with the SESSION SETUP as an AndX, but we
haven't really described the TREE CONNECT in detail.

So, although we have covered a tremendous amount of material, our progress seems
rather pathetic, doesn't it? What if the rest of SMB is just as tedious, verbose,
and difficult?

Relax. It's not.

Certainly there are other difficulties lying in wait, but the biggest ones have
already been identified and we are carefully avoiding them. If you pursue your
dream of creating a complete and competitive CIFS implementation then you may,
some day, need to know how things like MS-RPC and Extended Security really work
inside. Fortunately, you can do without them for now.

Let's just be clear on this before we move along:



> There is a lot you can do with CIFS without implementing any of the extended
> sub-protocols that SMB supports, but if you want to build a complete and
> competitive CIFS client/server implementation you will need to go well beyond
> the SMB protocol itself.



Everything should be made
as simple as possible,
but not simpler.
-- Albert Einstein   

That's why it has taken the Samba Team (with help from hundreds if not thousands
of people across the Internet) more than ten years to make Samba the
industrial-strength server system it is today. Tridge worked out the basics of
NBT and SMB in a couple of weeks back in 1991, but new things keep getting
tacked on to the system.

When implementing CIFS, the rule of thumb is this: Implement as little as
possible to do the job you need to do.

The minute you cross the border into uncharted territory you open up a whole new
world to explore and discover. Sometimes, you just don't want to go there. Other
times, you must.

Anyway, in the spirit of keeping things simple we will cover only a few more SMB
messages, and those in much less depth than we have done so far. There really is
no need to study every message, longword, bit, and string. If you've come this
far, you should know how to read packet captures and interpret the message
definitions in the SNIA doc. It is time to take the training wheels off and
learn to ride.


2.9.1 THAT TREE CONNECT THINGY

We have talked a lot about the TREE CONNECT ANDX REQUEST SMB. There was even an
example way back in section 2.4.4. The example looked like this:

    SMB_PARAMETERS
      {
      WordCount       = 4
      AndXCommand     = SMB_COM_NONE (0xFF)
      AndXOffset      = 0
      Flags           = 0x0000
      PasswordLength  = 1
      }
    SMB_DATA
      {
      ByteCount       = 22
      Password        = ""
      Path            = "\\SMEDLEY\HOME"
      Service         = "?????"  (yes, really)
      }

Notice that the TREE CONNECT includes a Password field, but that in this example
the Password field is almost empty (it contains a nul byte). If the server
negotiates Share Level security, then the password that would otherwise be in
the SESSION_SETUP_ANDX.CaseInsensitivePassword field will show up in the
TREE_CONNECT_ANDX.Password field instead. The password may be plaintext, or it
may be one of the response values we calculated earlier.

The TREE_CONNECT_ANDX.Path field is also worth mentioning. It contains the UNC
pathname of the share to which the client is trying to connect. In this example,
the client is attempting to access the HOME share on node SMEDLEY. Note that the
Path will be in Unicode if negotiated.

Finally there is that weird quintuple question mark string in the
TREE_CONNECT_ANDX.Service field. There are, as it turns out, five possible
values for that field:



    String   Meaning
    A:       A filesystem share
    LPT1:    A shared printer
    IPC      An interprocess communications named pipe
    COMM     A serial or other communications device
    ?????    Wildcard

It's annoying for the client to need to know the kind of share to which it is
connecting, which is probably why the wildcard option is available. The server
will return the service type in the Service field of the Response. Note that the
Service strings are always in 8-bit ASCII characters--never Unicode.
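
As a sanity check on the example request, here is a sketch that adds up the ByteCount. It assumes the Path and Service are sent as nul-terminated 8-bit strings (no Unicode, no alignment padding). The function name is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* ByteCount for a TREE CONNECT ANDX request when all strings are
 * nul-terminated 8-bit strings: the password bytes, then the Path
 * and Service strings, each with its terminating nul.
 */
size_t tree_connect_bytecount(size_t password_len,
                              const char *path,
                              const char *service)
{
    return password_len + (strlen(path) + 1) + (strlen(service) + 1);
}
```

With a one-byte password (the lone nul), a Path of "\\SMEDLEY\HOME", and a Service of "?????", the total is 1 + 15 + 6 = 22, which matches the ByteCount shown in the example.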

The response (for LANMAN2.1 and above) looks like this:

    SMB_PARAMETERS
      {
      WordCount        = 3
      AndXCommand      = <Next ANDX command>
      AndXOffset       = <Next ANDX block offset>
      OptionalSupport  = <A bitfield>
      }
    SMB_DATA
      {
      ByteCount        = <variable>
      Service          = <"A:" | "LPT1:" | "IPC" | "COMM">
      NativeFileSystem = <"" | "FAT" | "NTFS">
      }

The example above shows the empty string, "FAT", or "NTFS" as the valid values
for the NativeFileSystem field. Other values are possible. (Samba, for instance,
has a configuration option that allows you to put in anything you like.) The
empty string is used for the hidden IPC$ share.

There are two bits defined in the OptionalSupport bitfield:



Bit and meaning:

SMB_SUPPORT_SEARCH_BITS (0x0001)
    The meaning of this bit is explained in the LANMAN2.1 documentation.
    Basically, it indicates that the server knows how to perform directory
    searches that filter out some entries based on specific file attributes.
    For example, whether the DOS archive bit is set, whether the name
    represents a directory, etc. This is old stuff and all current
    implementations should support it.

SMB_SHARE_IS_IN_DFS (0x0002)
    This bit, if set, indicates that the UNC name is in the Distributed File
    System (DFS) namespace. DFS is yet to be covered.

There is a note in the SNIA doc that states that some servers will leave out the
OptionalSupport field even if the LANMAN2.1 or later dialect is negotiated. It
does not say whether SMB_SUPPORT_SEARCH_BITS should be assumed in such cases.
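
A cautious client can handle the missing field by checking WordCount before it reads OptionalSupport. The sketch below assumes the usual AndX layout of the parameter block (AndXCommand, a reserved byte, then the two-byte AndXOffset) so that OptionalSupport sits at byte offset 4, in little-endian order. The function name is hypothetical.

```c
#include <stdint.h>

#define SMB_SUPPORT_SEARCH_BITS  0x0001
#define SMB_SHARE_IS_IN_DFS      0x0002

/* params points at the first byte after WordCount in the response.
 * Returns 1 and stores the bitfield in *out if the server sent it,
 * or returns 0 if WordCount is too small for the field to be there.
 */
int tree_connect_optional_support(const uint8_t *params,
                                  uint8_t word_count,
                                  uint16_t *out)
{
    if (word_count < 3)
        return 0;   /* field omitted; the caller picks a default */
    *out = (uint16_t)(params[4] | (params[5] << 8));
    return 1;
}
```

What to assume when the function returns 0 is left to the caller, since the documentation does not say.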


2.9.2 SMB ECHO

Here's a toy we can play with.

ECHO is really as simple as it sounds. It's sort of the SMB equivalent of ping.
The client sends a packet with a data block full of bytes, and the server echoes
the block back. Simple.

...but this is CIFS we're talking about.

Although the ECHO itself is simple, there are many quirks to be found in
existing implementations. We will dig into this just a tiny bit to give you a
taste of the kinds of problems you are likely to encounter. Let's start with a
quick look at the ECHO REQUEST structure:

    SMB_PARAMETERS
      {
      WordCount  = 1
      EchoCount  = <In theory, anything from 0 to 65535>
      }
    SMB_DATA
      {
      ByteCount  = <Number of data bytes to follow>
      Bytes      = <Your favorite soup recipe?>
      }



Neun und neunzig luftballons
Auf ihrem Weg zum Horizont.
-- 99 Luftballons, Nena   

The EchoCount field is a multiplier. It tells the server to respond EchoCount
times. If EchoCount is zero, you shouldn't get any reply at all. If EchoCount is
9,999, then you are likely to get nine thousand, nine hundred, and ninety-nine
replies. We say likely because of the wide variety of weirdity that can be seen
in testing.
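
If you want to poke at ECHO yourself, the request body is easy to build. This sketch fills in only the parameter and data blocks shown above; the 32-byte SMB header and the NBT framing are assumed to be handled elsewhere. The function name is invented, and the fields are written in little-endian order like other SMB fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write the ECHO REQUEST parameter and data blocks into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
size_t echo_request_body(uint8_t *buf, size_t buf_len,
                         uint16_t echo_count,
                         const uint8_t *data, uint16_t data_len)
{
    size_t need = 1 + 2 + 2 + (size_t)data_len;

    if (buf_len < need)
        return 0;
    buf[0] = 1;                                /* WordCount      */
    buf[1] = (uint8_t)(echo_count & 0xFF);     /* EchoCount, low */
    buf[2] = (uint8_t)(echo_count >> 8);       /* EchoCount, high*/
    buf[3] = (uint8_t)(data_len & 0xFF);       /* ByteCount, low */
    buf[4] = (uint8_t)(data_len >> 8);         /* ByteCount, high*/
    if (data_len > 0)
        memcpy(buf + 5, data, data_len);       /* Bytes          */
    return need;
}
```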

One bit of weirdation is that all of the systems that were tested would respond
to an ECHO REQUEST even if no SESSION SETUP had been sent and no authentication
performed. This behavior is, in fact, per design, but it means that any client
that can talk to your server from anywhere can ask for EchoCount replies to a
single request. (It would probably be safer for the server to send an
ERRSRV/ERRnosupport error message in response to an un-authenticated ECHO
REQUEST.)

Other strangisms of note:



 * In testing, Windows/9x systems returned an "Invalid TID" error unless the TID
   was set to 0xFFFF. Also, these systems sent back at most a single reply,
   handling EchoCount as if it were a boolean.



 * WindowsNT4 and Windows2K would try to send as many replies as specified in
   EchoCount. If the data block (SMB_DATA.Bytes) was very large (4K was tested)
   and the EchoCount very high (e.g., 20,000), then the server would eventually
   give up and reset the connection.



 * Samba has an upper limit of 100 repetitions. Also, Samba sends the replies
   fast enough that multiple replies will be batched together in a single TCP
   packet. (That's normal behavior for a TCP stream.)



 * The WindowsNT4 (Service Pack 6) system used in testing failed to respond if
   the payload was greater than 4323 bytes. Windows2000 seems to have an upper
   limit of 16611 bytes, above which it resets the TCP connection.
   
   
   
   email
   
   --------------------------------------------------------------------------------
   
   From: Conrad Minshall, Apple Computer To: Chris Hertel Cc: Samba Technical
   Mailing List Subject: Re: Bizarre limit alert.
   
   I saw the same "packet drop" with an overlong WRITE_ANDX. The maximum buffer
   size an NT SP6 claims on the NEGOTIATE response is 0x1104 (4356). This limit
   is not on the data, the limit includes the SMB header (32 bytes) and the SMB
   command. Based upon the size of an ECHO command I'd expect you could send
   4319 bytes, not 4323, so on this topic you'll have to have the last word...
   sorry.
    
   
   
   
   No apologies. This is CIFS we're talking about.

The ECHO SMB may be one of those things that gets coded up just because it's in
the documentation and it seems easy. It also appears as though ECHO hasn't been
tested much. Certainly, the more it is stressed, the more variation can be
seen. There is, however, something to note in the last example in the list above
and in the message from Conrad: once you know what you're looking at, you will
find common themes that appear and reappear across a given implementation. These
common themes are derived from common internals, and they can provide many clues
about the inner workings of the implementation.
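
Conrad's arithmetic is easy to reproduce. The sketch below assumes that "the SMB command" overhead for an ECHO is five bytes: WordCount (1), EchoCount (2), and ByteCount (2). That reading is ours, and the function name is made up.

```c
#include <stddef.h>

#define SMB_HDR_LEN     32   /* fixed SMB header                       */
#define ECHO_OVERHEAD    5   /* WordCount + EchoCount + ByteCount      */

/* Largest ECHO payload that fits within the negotiated maximum
 * buffer size, if the limit covers the header and command as well
 * as the data.
 */
size_t echo_max_payload(size_t max_buffer_size)
{
    size_t overhead = SMB_HDR_LEN + ECHO_OVERHEAD;

    return (max_buffer_size > overhead) ? max_buffer_size - overhead : 0;
}
```

For the NT SP6 value of 0x1104 (4356) this gives 4319, the figure from the email, which is four bytes short of the 4323 seen in testing.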

Another fine point highlighted by our quick look at the ECHO SMB is that TCP is
designed to carry streams of data--not discrete packets. This can be seen in the
results of the tests against Samba, in which multiple replies were contained in
a single TCP packet. At the other extreme, several TCP packets are needed to
transfer a single ECHO if it has a very large data payload. As a result, a
single read operation may or may not return one and only one complete SMB
message.



Oversimplification Alert:


--------------------------------------------------------------------------------

The RecvTimeout() function (provided way back in Listing 2.1) makes the
assumption that one complete SMB message will be returned per call to the recv()
function. That's a weak assumption. It works well enough for the simple testing
we have done so far, but it is not sufficient for a real SMB implementation.

A better version of RecvTimeout() would verify the received data length against
the NBT SESSION_MESSAGE.LENGTH field value to ensure that only one message is
read at a time, and that the complete message is read before it is returned.
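
A rough sketch of that idea follows. It is not a replacement for RecvTimeout(): there is no timeout handling, no retry on interrupted calls, and no check of the NBT message type byte. It reads the four-byte NBT session header, pulls the 17-bit length out of it (the low bit of the flags byte plus the two length bytes, in network byte order), and then keeps calling recv() until the whole message has arrived. The function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read exactly len bytes from the stream.
 * Returns 0 on success, -1 on error or if the peer closes early.
 */
static int recv_all(int sock, uint8_t *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Read one complete NBT session message payload into buf.
 * Returns the payload length, or -1 on error or if the message
 * will not fit in buf.
 */
long recv_nbt_message(int sock, uint8_t *buf, size_t buf_len)
{
    uint8_t hdr[4];
    size_t  len;

    if (recv_all(sock, hdr, sizeof hdr) < 0)
        return -1;
    /* 17-bit length: low bit of the flags byte, then 16 bits. */
    len = ((size_t)(hdr[1] & 0x01) << 16)
        | ((size_t)hdr[2] << 8)
        |  (size_t)hdr[3];
    if (len > buf_len)
        return -1;
    if (recv_all(sock, buf, len) < 0)
        return -1;
    return (long)len;
}
```

Because the length comes from the header, two messages that arrive in one TCP segment are still returned one at a time.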
 




2.9.3 READIN', WRITIN', AND 'RITHMATIC

Here is a quick run-down on some of the basic essentials of SMB.



OPEN_ANDX: This SMB is discussed in examples given throughout the SNIA doc, but
there is no actual writeup given there. That's because it was labeled as
"obsolescent" in the Leach/Naik CIFS draft. The NT_CREATE_ANDX SMB is now
considered the more fashionable choice. Servers must still support the OPEN_ANDX
SMB, however, and there are certainly clients that still send it (even under the
NT LM 0.12 dialect).

It's times like these that the earlier documentation comes in really handy.

The OPEN_ANDX SMB is used to gain access to a file for further processing
(reading, writing, that sort of thing). The open file is identified by a FID
(File ID). The FID, of course, is returned by a successful OPEN_ANDX call.



READ_ANDX: It seems fairly obvious. This one lets you read blocks of data from a
file (or device) on the server. The READ_ANDX request supports 64-bit file
offsets if the OffsetHigh field is present (if it is present, the WordCount will
be 12).

An oddity of the READ_ANDX is the MaxCountHigh field, which is only used if the
CAP_LARGE_READX capability has been set. MaxCountHigh is an unsigned long (four
bytes) that is supposed to hold the upper 16 bits (two bytes) of the read size,
extending the unsigned short (two-byte) MaxCount field. Two problems with this:

 1. Why use a 32-bit field to hold 16 bits worth of data?
 2. Even with CAP_LARGE_READX set, the maximum large read is 64K. That should
    fit into the MaxCount field with no need for MaxCountHigh.

Play with it and see what happens. Should be interesting.
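
The 64-bit offset handling mentioned above amounts to splitting the offset into two 32-bit halves. In this sketch the structure and function names are hypothetical, the WordCount of 10 for the short form is an assumption (the text only gives 12 for the long form), and the choice to send OffsetHigh only when it is non-zero is just one possible policy.

```c
#include <stdint.h>

/* The offset-related pieces of a READ_ANDX request (hypothetical). */
typedef struct {
    uint8_t  word_count;    /* 12 with OffsetHigh, assumed 10 without */
    uint32_t offset;        /* low 32 bits of the file offset         */
    uint32_t offset_high;   /* high 32 bits of the file offset        */
} read_andx_offset;

read_andx_offset read_andx_split_offset(uint64_t file_offset)
{
    read_andx_offset r;

    r.offset      = (uint32_t)(file_offset & 0xFFFFFFFFu);
    r.offset_high = (uint32_t)(file_offset >> 32);
    /* Only use the longer form when the high half is needed. */
    r.word_count  = (r.offset_high != 0) ? 12 : 10;
    return r;
}
```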



WRITE_ANDX: Allows writing to a file or device. This SMB can also be extended by
two words to include an OffsetHigh field, thus providing 64-bit offsets. There
is also a DataLengthHigh field that is comparable to the MaxCountHigh from the
READ_ANDX. In this case, though, the DataLengthHigh field is given as an
unsigned short. That's only two bytes, which makes more sense.



SEEK: This one may be considered deprecated. Newer clients probably don't need
to send the SMB_COM_SEEK, but servers may need to support it just in case.



email

--------------------------------------------------------------------------------

From: Charles Caldarale To: jCIFS Mailing List

SMB_COM_SEEK is a useless SMB, since all of the read and write functions require
a file relative address. It's not surprising it wasn't used; it would have been
a waste of network bandwidth if it had been sent.

 - Chuck
 



See also the SNIA doc's comments regarding this SMB.



FLUSH: The SMB_COM_FLUSH has nothing to do with plumbing. It is sent by the
client to ask the server to write all data and metadata for an open file
(specified by its FID) to disk. If a FID value of 0xFFFF is given, the server is
being asked to flush all open files relative to the TID.



NT_CREATE_ANDX: This SMB is used to open, create, or overwrite a file or
directory. It offers a myriad of options for file attributes, file sharing,
security, etc. As the "NT" in the name implies, the NT_CREATE_ANDX SMB is
closely tied to the feature set offered by WindowsNT filesystem calls. Here's
where you start needing to know more about Windows itself.

One problem with complex calls such as this is that the number of permutations
gets to be very high, and it quickly becomes very difficult to test them all [56].
There are various reports describing combinations of values that can cause a
WindowsNT client or server to go BSOD (Blue Screen Of Death). Have fun with your
testing.

There is yet another version of this SMB known as the NT_TRANSACT_CREATE, which
is implemented as a sub-command of the SMB_COM_NT_TRANSACTION SMB. It is used to
apply Extended Attributes (EAs) or Security Descriptors (SDs) to a file or
directory.



CLOSE: All good things must come to an end. Close the file, say goodnight, sing
one more song, and get some rest.

Remember earlier when we talked about SMB messages as if we were dissecting some
strange, new species of multi-legged critter? Well, we've moved beyond
Entomology, Invertebrate Zoology, Taxonomy, and such. We're now studying really
complex stuff like Sociology, Psychology, and Numismatics, and we get to put the
little critters into Skinner boxes and see how they react to various stimuli.
It's important research, and there are all sorts of interesting things to
discover.

Consider, for example, the SMB_COM_COPY command. It's supposed to allow you to
copy a file from one location on the server to another location. That saves the
client from having to read the data over the wire and write it back again. A
good idea, eh? Unfortunately, no one seems to be able to get it to work--at
least, not against Windows servers. There has been some limited success in the
laboratory...



email

--------------------------------------------------------------------------------

From: Greg McCain To: Chris Hertel Subject: CIFS and SMB_COPY

Chris,

I found that smb_copy will in fact copy a file iff:
- the src file is in the root of the share
- you do not specify the full path to the file src and dest files in the
smb_copy command. Instead, just specify the names of the files (this is out of
spec.).

The resulting destination file will be named like the source file, minus the
first character. It will NOT be named as specified in the dest parameter. Hence
"smb_copy wanda -> fred" results in a second file "anda" in the root of the
share.

This works on the .NET server RC1 and Windows 2000 servers that I've tried. Hope
it helps.
 



SMB is an old protocol, and it has gotten sloppy over the years. As you work
your way through the SMB messages, implementing first the easy ones and then the
more difficult ones, keep this thought in mind: It's not your fault.

Say it to yourself now: "It's not my fault."

Very good.

That will prevent you from getting frustrated and doubting your own skill. It's
really not your fault.


2.9.4 TRANSACTION SMBS

We are going to blast through this, so you'd better get your running shoes on.

The purpose of the Transaction SMBs is to carry specialized sub-protocols.
Examples include the Remote Administration Protocol (RAP) and Microsoft's
implementation of DCE/RPC (MS-RPC). There are other, more esoteric sets of calls
as well. We will play with some of them when we get to the Browse Service.

Think of these sub-protocols as sets of function calls that are stretched across
the network. As suggested in figure 2.18, a function call is made on the client
side and the parameters and data are packed up and shoved across the network.
The call is then completed at the remote end and the results (if any) are packed
up and shoved back. In CIFS jargon, that's called a transaction.

[Figure 2.18]

Transactions are designed to be able to transfer more data than the limit
imposed by the negotiated buffer size. They do so by fragmenting the payload.
The protocol for sending large Protocol Data Units (PDUs) is described in a
variety of documents, but here is a quick run-down:



 1. A primary Transaction SMB is sent. It includes the total expected size of
    the transaction (so that the server can prepare to receive the data). It
    also contains as much of the data as will fit in a single SMB message. If
    everything fits, skip to step 4.

 2. The server sends back an interim response. If the interim response contains
    an error code then the transaction will be aborted. Otherwise, it is a
    signal telling the client to continue. The WORDCOUNT and BYTECOUNT fields
    are both zero in this message (it's a disembodied header).

 3. The client sends as many secondary Transaction SMBs as necessary to complete
    the transaction request.

 4. The server executes the called function.

 5. The server sends as many response messages as necessary to return the
    results. In some cases the request does not generate results, and no
    response is required.
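
The bookkeeping behind step 3 is a ceiling division. The sketch below works out how many secondary messages a client would need, given how many payload bytes fit in the primary message and in each secondary. It lumps parameters and data together into one payload size, which is a simplification, and the function name is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Number of secondary Transaction SMBs needed to send total_len
 * payload bytes, when the primary carries up to primary_cap bytes
 * and each secondary carries up to secondary_cap bytes.
 * secondary_cap must be non-zero.
 */
size_t trans_secondaries_needed(size_t total_len,
                                size_t primary_cap,
                                size_t secondary_cap)
{
    size_t remaining;

    assert(secondary_cap > 0);
    if (total_len <= primary_cap)
        return 0;                 /* everything fit; skip to step 4 */
    remaining = total_len - primary_cap;
    return (remaining + secondary_cap - 1) / secondary_cap;
}
```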

There are three primary Transaction SMBs:



> SMB_COM_TRANSACTION   ==  0x25
> SMB_COM_TRANSACTION2  ==  0x32
> SMB_COM_NT_TRANSACT   ==  0xA0

Those are really long names, so folks on the various mailing lists tend to
shorten them to "SMBtrans", "Trans2", and "NTtrans", respectively. Each of these
also has a matching secondary:



> SMB_COM_TRANSACTION_SECONDARY   ==  0x26
> SMB_COM_TRANSACTION2_SECONDARY  ==  0x33
> SMB_COM_NT_TRANSACT_SECONDARY   ==  0xA1

There is very little difference between these three transaction types, except
that the NTtrans SMB has 32-bit fields where the other two have 16-bit fields.
That means that NTtrans can handle a lot more data (that is, much larger
transactions). Besides that, the real difference between these three is the set
of functions that are traditionally carried over each.
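
For reference, the six command codes can be collected into an enum along with a lookup from each primary to its secondary. The helper name is hypothetical, and the use of zero as a "not a transaction" return value is just a convenience for this sketch.

```c
#include <stdint.h>

enum {
    SMB_COM_TRANSACTION            = 0x25,
    SMB_COM_TRANSACTION_SECONDARY  = 0x26,
    SMB_COM_TRANSACTION2           = 0x32,
    SMB_COM_TRANSACTION2_SECONDARY = 0x33,
    SMB_COM_NT_TRANSACT            = 0xA0,
    SMB_COM_NT_TRANSACT_SECONDARY  = 0xA1
};

/* Return the secondary command that matches a primary Transaction
 * command, or 0 if the command is not a primary Transaction SMB.
 */
uint8_t trans_secondary_for(uint8_t primary)
{
    switch (primary) {
    case SMB_COM_TRANSACTION:
        return SMB_COM_TRANSACTION_SECONDARY;
    case SMB_COM_TRANSACTION2:
        return SMB_COM_TRANSACTION2_SECONDARY;
    case SMB_COM_NT_TRANSACT:
        return SMB_COM_NT_TRANSACT_SECONDARY;
    default:
        return 0;
    }
}
```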

The SNIA doc and the Leach/Naik CIFS draft provide examples of transactions that
use Trans2 and NTtrans. Calls that use SMBtrans are documented elsewhere. Places
to look include Luke's book (DCE/RPC over SMB), the Leach/Naik Browser and RAP
Internet Drafts, and the X/Open documentation (particularly IPC Mechanisms for
SMB). These (as you already know) are listed in the References section.

2.9.4.1 MAILSLOTS AND NAMED PIPES

Just to simplify things even further, SMBtrans supports yet another layer of
abstraction.

Mailslots and Named Pipes are used to access specific sets of remote functions.
For example, the "LANMAN" pipe (which is identified as \PIPE\LANMAN) is always
used for RAP calls.

Named Pipes are two-way inter-process communications channels. Once opened, they
can be read from or written to as if they were files. In contrast, Mailslots are
used for one-way, connectionless communications.

...and this is where something unexpected happens. Mailslot messages are sent
using SMBs transported via the NBT Datagram Service. You'll have to see it to
believe it, but that is easily arranged. All you need to do is grab a packet
capture of port 138 on an active LAN, one with a few local servers that announce
themselves to the working Network Neighborhood. If you don't like to wait,
reboot something. A Windows/9x system that offers shares will do nicely.

This topic will be revisited in the Browse Service section. If you want to do
some extra-curricular reading, the X/Open IPC Mechanisms for SMB document is
recommended.



--------------------------------------------------------------------------------


2.10 THE REMAINING ODDITIES

Promises were made, and promises should be kept.

Remember that closet full of concepts that burst open and spilled out all over
the floor? Well, we have managed to clean up a good deal of the mess, but there
are still a few things that we said we would put away--and we will. We can
provide a brief explanation of each of these as we shove them back into the
closet, just so you are not surprised when you stumble across them in the
literature.


2.10.1 OPPORTUNISTIC LOCKS (OPLOCKS)

OpLocks are a caching mechanism.

A client may request an OpLock from an SMB server when it opens a file. If the
server grants the request, then the client knows that it can safely cache large
chunks of the file and not tell the server what it is doing with those cached
chunks until it is finished. That saves a lot of network I/O round-trip time and
is a very big boost to performance.

The problem, of course, is that other clients may want to access the same file
at the same time. As long as everyone is just reading the file things are okay,
but if even one client makes a change then all of the cached copies held by the
other clients will be out of sync. That's why OpLock handling is a bit tricky.

There are two types of OpLocks that a client may request:



 * Exclusive
 * Batch

We came across these two when digging into the SMB_HEADER.FLAGS field way back
in section 2.5.2. In olden times, the client would request an OpLock by setting
the SMB_FLAGS_REQUEST_OPLOCK bit and, optionally, the
SMB_FLAGS_REQUEST_BATCH_OPLOCK bit in the FLAGS field when opening a file.
Nowadays the FLAGS bits are (supposedly) ignored and fields within newer-style
SMBs are used instead.
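
As a reminder of how the old-style request worked, here is a minimal sketch of setting those two bits. The bit values (0x20 and 0x40) are the ones commonly given for the FLAGS field and should be checked against the table in section 2.5.2. The enum and function name are invented for this example.

```c
#include <stdint.h>

/* FLAGS bits used to request an OpLock on open (old style). */
#define SMB_FLAGS_REQUEST_OPLOCK        0x20
#define SMB_FLAGS_REQUEST_BATCH_OPLOCK  0x40

/* Hypothetical type: what the caller would like to ask for. */
typedef enum { OPLOCK_NONE, OPLOCK_EXCLUSIVE, OPLOCK_BATCH } oplock_request;

/* Return a copy of flags with the OpLock request bits set to match
 * the request.  The batch bit is only set together with the basic
 * request bit, since it is the optional add-on.
 */
uint8_t smb_flags_set_oplock(uint8_t flags, oplock_request want)
{
    flags &= (uint8_t)~(SMB_FLAGS_REQUEST_OPLOCK |
                        SMB_FLAGS_REQUEST_BATCH_OPLOCK);
    if (want != OPLOCK_NONE)
        flags |= SMB_FLAGS_REQUEST_OPLOCK;
    if (want == OPLOCK_BATCH)
        flags |= SMB_FLAGS_REQUEST_BATCH_OPLOCK;
    return flags;
}
```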

Anyway, an Exclusive OpLock can be granted if no other client or application is
accessing the file at all. The client may then read, write, lock, and unlock the
cached portions of the file without informing the server. As long as the client
holds the Exclusive OpLock, it knows that it won't cause any conflicts. It's
sort of like a kid sitting in a corner of the kitchen with a spoon and a big ol'
carton of ice cream. As long as no one else is looking, that kid's world is just
the spoon and the ice cream.

Batch OpLocks are similar to Exclusive OpLocks except that they cause the client
to delay sending a CLOSE SMB to the server. This is done specifically to bypass
a weirdity in the way that DOS handles batch files (batch files are the DOS
equivalent of shell scripts). The problem is that DOS executes these scripts in
the following way:



 1. Set offset to zero (0)
 2. Open the batch file
 3. Seek to the stored offset
 4. If EOF, then exit
 5. Read one line
 6. Store the current offset
 7. Close the batch file
 8. Execute the line
 9. Go back to step 2

Yes, you've read that correctly. The batch file is opened and closed for every
line. It's ugly, but that's what DOS reportedly does and that's why there are
Batch OpLocks.
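
To see just how much opening and closing that involves, the nine steps can be sketched in C. This is only an illustration of the reported behavior, not DOS source code. The function and callback names are made up, and lines longer than the buffer would be split.

```c
#include <stdio.h>

/* Hypothetical callback type: "executes" one line of the script. */
typedef void (*line_handler)(const char *line, void *ctx);

/* Run a script the way DOS reportedly does, re-opening the file for
 * every line.  Returns the number of times the file was opened, or
 * -1 if it could not be opened.
 */
int run_batch_dos_style(const char *path, line_handler exec, void *ctx)
{
    long offset = 0;                              /* step 1 */
    int  opens  = 0;
    char line[256];

    for (;;) {
        FILE *fp = fopen(path, "rb");             /* step 2 */

        if (fp == NULL)
            return -1;
        opens++;

        if (fseek(fp, offset, SEEK_SET) != 0 ||   /* step 3 */
            fgets(line, (int)sizeof line, fp) == NULL) { /* steps 4, 5 */
            fclose(fp);
            return opens;
        }
        offset = ftell(fp);                       /* step 6 */
        fclose(fp);                               /* step 7 */

        if (exec != NULL)
            exec(line, ctx);                      /* step 8 */
    }                                             /* step 9 */
}
```

A three-line script costs four opens (one per line, plus the one that discovers EOF). Over a network, each of those would be an OPEN and a CLOSE on the wire.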

To make Batch OpLocks effective, the client's SMB layer simply delays sending
the CLOSE message. If the file is opened again, the CLOSE and OPEN simply cancel
each other out and nothing needs to be sent over the wire at all. That also
means that the client can keep hold of the cached copy of the batch file so that
it doesn't have to re-read it for every line of the script.
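
The cancelling-out trick can be modelled with a few lines of bookkeeping. The sketch below tracks a single file and simply counts the messages that would have gone over the wire. The structure and function names are hypothetical, and a real client would also need a timeout and OpLock Break handling to decide when the held-back CLOSE must finally be sent.

```c
#include <string.h>

/* Hypothetical client-side state for one file under a Batch OpLock. */
typedef struct {
    char name[64];       /* file the pending CLOSE refers to       */
    int  close_pending;  /* nonzero if a CLOSE is being held back  */
    int  wire_msgs;      /* number of SMBs actually "sent"         */
} batch_cache;

void bc_init(batch_cache *bc)
{
    memset(bc, 0, sizeof *bc);
}

/* Send any held-back CLOSE (for example, on timeout or OpLock Break). */
void bc_flush(batch_cache *bc)
{
    if (bc->close_pending) {
        bc->wire_msgs++;
        bc->close_pending = 0;
    }
}

void bc_open(batch_cache *bc, const char *name)
{
    if (bc->close_pending && strcmp(bc->name, name) == 0) {
        bc->close_pending = 0;   /* CLOSE and OPEN cancel out */
        return;
    }
    bc_flush(bc);                /* different file: release the old one */
    strncpy(bc->name, name, sizeof bc->name - 1);
    bc->name[sizeof bc->name - 1] = '\0';
    bc->wire_msgs++;             /* a real OPEN goes to the server */
}

/* The application closed the file; hold the CLOSE back for now. */
void bc_close(batch_cache *bc)
{
    bc->close_pending = 1;
}
```

With this in place, any number of open/close cycles on the same batch file costs one OPEN, plus one CLOSE when the cache is finally flushed.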

There is also a third type of OpLock, known as a Level II OpLock, which the
client cannot request but the server may grant. Level II OpLocks are,
essentially, "read-only" OpLocks. They permit the client to cache data for
reading only. All operations which would change the file or meta-data must still
be sent to the server.

Level II OpLocks may be granted when the server cannot grant an Exclusive or
Batch OpLock. They allow multiple clients to cache the same file at the same
time so, unlike the other two, Level II OpLocks are not exclusive. As long as
all of the clients are just reading their cached copies there is no chance of
conflict. If one client makes a change, however, then all of the other clients
need to be notified that their cached copies are no longer valid. That's called
an OpLock Break.

2.10.1.1 OPLOCK BREAKS

It's called an "OpLock Break" because it involves breaking an existing OpLock.
The more formal term is "revocation", but no one actually says that when they
get together after hours to sit around, drink tea, and whine about CIFS.

OpLock Breaks are sent from the server to the client. This is unusual, because
SMB request/response pairs are always initiated by the client. The OpLock Break
is sent out-of-band by the server, which is against the rules...but this is CIFS
we're talking about. Who needs rules?

The OpLock Break is sent in the form of an SMB_COM_LOCKING_ANDX message. The
server may send this to reduce an Exclusive or Batch OpLock to a Level II
OpLock, or to revoke an existing OpLock entirely. In either case, the client's
immediate responsibility is to flush its cache to comply with the new OpLock
status. If the client held an Exclusive or Batch OpLock, it must send all writes
to the server and request any byte-range locks that it needs in order to
continue processing. If the OpLock has been reduced to a Level II OpLock, the
client may keep its local cache for read-only purposes.
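
The client's side of that bargain might be sketched as follows. This models only the bookkeeping described above (flush writes, request locks, decide whether the read cache survives). The types and names are invented for the example and say nothing about the wire format of the break message.

```c
/* Hypothetical OpLock levels as tracked inside a client. */
typedef enum {
    OPLOCK_LVL_NONE,
    OPLOCK_LVL_II,
    OPLOCK_LVL_EXCLUSIVE,
    OPLOCK_LVL_BATCH
} oplock_level;

/* Hypothetical per-file cache state. */
typedef struct {
    oplock_level level;
    int dirty_writes;      /* cached writes not yet sent to the server */
    int cached_locks;      /* byte-range locks held only locally       */
    int read_cache_valid;  /* nonzero if cached data may still be read */
    int writes_sent;       /* counters standing in for real SMBs       */
    int locks_requested;
} cached_file;

/* React to an OpLock Break that reduces the OpLock to new_level. */
void handle_oplock_break(cached_file *f, oplock_level new_level)
{
    if (f->level == OPLOCK_LVL_EXCLUSIVE || f->level == OPLOCK_LVL_BATCH) {
        /* Send all cached writes to the server. */
        f->writes_sent += f->dirty_writes;
        f->dirty_writes = 0;
        /* Ask the server for the locks we had been granting ourselves. */
        f->locks_requested += f->cached_locks;
        f->cached_locks = 0;
    }
    /* A Level II OpLock lets us keep the cache for reading only. */
    f->read_cache_valid = (new_level == OPLOCK_LVL_II);
    f->level = new_level;
}
```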

Note that there is a big difference between OpLocks and the more traditional
types of locks. With a traditional file or byte-range lock, the client is in
charge once it has obtained a lock. It can maintain it as long as needed,
relinquishing it only when it is finished using it. In contrast, an OpLock is
like borrowing your neighbor's lawnmower. You have to give it back when your
neighbor asks for it.

Support for OpLocks is optional on both the client and the server side, but
implementing them provides a hefty performance boost. More information on
OpLocks may be found in the Paul Leach/Dan Perry article CIFS: A Common Internet
File System (listed in the References), as well as the usual sources.


2.10.2 DISTRIBUTED FILE SYSTEM (DFS)

The CIFS Distributed File System (DFS) is not nearly as fancy as it sounds. It
is simply a way to collect separate shares into a single, virtual tree
structure. It also has some limited ability to provide fileserver redundancy and
load balancing.

The key feature of DFS is that it can create links from within a shared tree on
one server to shares and directories on another, thus providing a single point
of entry to a virtual SMB tree. From the user's perspective, the whole thing
looks like a single share, even though the resources are scattered across
separate SMB servers.

Clear as mud? Perhaps an illustration will help...

[Figure 2.19]

In figure 2.19, the client is shown attempting to access a file on server
PETSERVER. Well, that's where the client thinks the file resides. On the server
side, the name CORGIS in the DOGS directory is actually a link to another UNC
pathname: \\DATADOG\CORGIS. Following that link leads us to a different share on
a different server.

The server offering the DFS share (PETSERVER, in our example) does not act as a
proxy for the client. That is, it won't follow the DFS links itself. Instead,
the server sends an error code to the client indicating that there is some
additional work to be done. The error code is either a DOS code of
ERRSRV/ERRbadtype (0x02/0x0003), or an NT_STATUS code of
STATUS_DFS_PATH_NOT_COVERED (0xC0000257).
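
On the client side, spotting that condition is a simple comparison. The sketch below uses the code values quoted above. The function names are hypothetical, and how the error class and code are pulled out of the SMB header is left to the rest of your implementation.

```c
#include <stdint.h>

/* Error values quoted in the text above. */
#define ERRSRV                       0x02
#define ERRbadtype                   0x0003
#define STATUS_DFS_PATH_NOT_COVERED  0xC0000257u

/* Returns nonzero if a DOS-style error means "go ask for a referral". */
int dos_error_needs_referral(uint8_t err_class, uint16_t err_code)
{
    return err_class == ERRSRV && err_code == ERRbadtype;
}

/* Returns nonzero if an NT_STATUS code means "go ask for a referral". */
int nt_status_needs_referral(uint32_t status)
{
    return status == STATUS_DFS_PATH_NOT_COVERED;
}
```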

The client's task, at this point, is to query the server to resolve the link.
The client sends a TRANS2_GET_DFS_REFERRAL which is passed to the server via the
Trans2 transaction mechanism, briefly described earlier. The client will use the
information provided in the query response to create a new UNC path. It must
then establish an SMB session with the new server. This whole mess is known as a
"DFS referral".

It was mentioned above that DFS can provide a certain amount of redundancy. This
is possible because the links in the DFS tree may contain multiple references.
If the client fails to connect to the first server listed in the referral it can
try the second, and so on. DFS can also provide a simple form of load balancing
by reshuffling the order in which the list of links is presented each time it is
queried. Of course, load balancing and redundancy are only workable if all of
the linked copies are in sync.
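
Both ideas are easy to sketch. The code below assumes the referral has already been decoded into an array of UNC path strings. The function names, the callback type, and the demonstration callback are all made up for this example, and a rotate-by-one is just one simple way a server might reshuffle the list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: try to set up a session with one target.
 * Returns nonzero on success.
 */
typedef int (*connect_fn)(const char *unc_path, void *ctx);

/* Client side: walk the referral list in order and return the index
 * of the first target that connects, or -1 if none do.
 */
int dfs_try_targets(const char *targets[], size_t count,
                    connect_fn try_connect, void *ctx)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (try_connect(targets[i], ctx))
            return (int)i;
    }
    return -1;
}

/* Server side: rotate the list left by one so that the next query
 * sees a different target first.
 */
void dfs_rotate_targets(const char *targets[], size_t count)
{
    const char *first;
    size_t i;

    if (count < 2)
        return;
    first = targets[0];
    for (i = 1; i < count; i++)
        targets[i - 1] = targets[i];
    targets[count - 1] = first;
}

/* Stand-in for a real connection attempt, for demonstration only:
 * it "fails" for the one path named by ctx and succeeds otherwise.
 */
int demo_connect(const char *unc_path, void *ctx)
{
    return strcmp(unc_path, (const char *)ctx) != 0;
}
```

For instance, given a list holding \\DATADOG\CORGIS and a second, hypothetical replica, a client whose first attempt fails simply moves on to the next entry.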

A quick search of the web will turn up a lot of articles and papers that do a
better job of describing the behavior of DFS than the blurb provided here. If
you are planning on implementing DFS it is worthwhile to read up on the subject
a bit, just to get a complete sense of how it is supposed to work from the user
or network administrator's perspective. The SNIA doc provides enough information
to get you started building a working client implementation. The server side is
more complex because doing it right involves implementing a set of management
functions as well.


2.10.3 DOS ATTRIBUTES, EXTENDED FILE ATTRIBUTES, LONG FILENAMES, AND SUCHLIKE

These all present the same problem.

The CIFS protocol suite is designed, in its heart and soul, to work with DOS,
OS/2, and Windows systems. As a result, the protocols that make up the CIFS
suite have a tendency to reflect the behavior of those operating systems.

DOS, of course, is the oldest and simplest of the IBM/Microsoft family of PC
OSes. The filesystem used with DOS is the venerable File Allocation Table (FAT)
filesystem which, according to legend, was originally coded up by Bill Gates
himself. The characteristics of the FAT filesystem should be familiar to anyone
who has spent any time working with DOS. Consider, for example, the following
FAT features:



 * Case-Insensitive Filenames
   Case is ignored, though file and directory names are stored in upper-case.



 * 8.3 Filename Format
   The name is a maximum of eight bytes in length, optionally followed by a dot
   ('.') and an extension of at most three bytes. For example: FILENAME.EXT



 * No Users or Groups
   FAT does not understand ownership of files.



 * Six Attribute Bits
   FAT supports six attribute bits, stored in a single byte. The best known are
   the Archive, Hidden, Read-only, and System bits but there are two more:
   Volume Label and Directory. These are used to identify the type and handling
   of a FAT table entry.

It's a fairly spartan system.
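
For reference, the six attribute bits described above can be written out as C constants. The bit positions are the standard FAT directory entry values. The constant names and the little formatting helper are just for illustration.

```c
#include <stdint.h>

/* The six FAT attribute bits, one byte per directory entry. */
#define FAT_ATTR_READONLY   0x01
#define FAT_ATTR_HIDDEN     0x02
#define FAT_ATTR_SYSTEM     0x04
#define FAT_ATTR_VOLUME     0x08   /* Volume Label */
#define FAT_ATTR_DIRECTORY  0x10
#define FAT_ATTR_ARCHIVE    0x20

/* Render an attribute byte as a six-character string such as
 * "R----A", with '-' for each bit that is clear.
 * The out buffer must have room for 7 bytes.
 */
void fat_attr_string(uint8_t attr, char out[7])
{
    static const char letters[6] = { 'R', 'H', 'S', 'V', 'D', 'A' };
    int i;

    for (i = 0; i < 6; i++)
        out[i] = (attr & (1u << i)) ? letters[i] : '-';
    out[6] = '\0';
}
```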

There are improvements and extensions that have appeared over the years. The
FAT32 filesystem, for example, is a modified version of FAT that uses disk space
more efficiently and also supports much larger disk sizes than the original.
There is also VFAT, which keeps track of both 8.3 format filenames and longer
secondary filenames that may contain a wider variety of characters than the 8.3
format allows. VFAT long filenames are case-preserving (but not case sensitive)
so, overall, VFAT allows a lot more creativity with file and directory names [57].

Even with these extensions, the semantics of the FAT filesystem are not
sufficient to meet the needs of more powerful OSes such as OS/2 and WindowsNT.
These OSes have newer, more complex filesystems which they support in addition
to FAT. Specifically, OS/2 has HPFS (High Performance File System) and WindowsNT
& W2K can make use of NTFS (New Technology File System). These newer filesystems
have lots and lots of features which, in turn, have to be supported by CIFS.

Problems arise when the server semantics (made available via CIFS) do not match
those expected by the client. Consider, for instance, Samba running on a Unix
system. Unix filesystems typically have these general characteristics:



 * Case-Sensitive Filenames
   Case is significant in most Unix filesystems. File and directory names are
   stored with case preserved.



 * Longer, More Complex Filenames
   Unix filesystems allow for a great deal of creativity in naming files and
   directories.



 * Users and Groups
   Unix filesystems assign user and group ownership to each directory entry.



 * More, and Different, Attribute Bits
   There are three sets of three bits each used for basic file access (read,
   write, and execute permissions for user, group, and world). There are
   additional bits defined for more esoteric purposes.

Now consider a Windows application that requires the old 8.3 name format. (Such
applications do exist. They make calls to older, 16-bit OS functions that assume
8.3 format.) Unlike VFAT, Unix filesystems do not normally keep track of both
long and short names. That causes a problem, and Samba has to compensate by
generating 8.3 format names on the fly. The process is called "Name Mangling".
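
To give a feel for what mangling involves, here is a deliberately naive sketch. It is not Samba's algorithm. It assumes a simple scheme in which the base name is cut to six characters and given a "~" plus a digit, and it makes no attempt to detect collisions or to leave already-valid 8.3 names alone. The function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy up to max characters from src[0..len) into dst, upper-cased,
 * dropping anything that is not a letter or a digit.
 * Returns the number of characters written (no NUL is added).
 */
static size_t copy_clean(char *dst, const char *src, size_t len, size_t max)
{
    size_t i, n = 0;

    for (i = 0; i < len && n < max; i++) {
        unsigned char c = (unsigned char)src[i];

        if (isalnum(c))
            dst[n++] = (char)toupper(c);
    }
    return n;
}

/* Build an 8.3-style name from longname.  seq is a caller-supplied
 * digit (1..9) used to tell apart names that mangle the same way.
 * The out buffer must have room for 13 bytes: 8 + '.' + 3 + NUL.
 */
void mangle_83(const char *longname, int seq, char out[13])
{
    const char *dot = strrchr(longname, '.');
    size_t base_len = dot ? (size_t)(dot - longname) : strlen(longname);
    size_t n;

    n = copy_clean(out, longname, base_len, 6);
    out[n++] = '~';
    out[n++] = (char)('0' + (seq % 10));

    if (dot != NULL && dot[1] != '\0') {
        size_t e;

        out[n++] = '.';
        e = copy_clean(out + n, dot + 1, strlen(dot + 1), 3);
        if (e == 0)
            n--;            /* nothing usable: drop the dot again */
        else
            n += e;
    }
    out[n] = '\0';
}
```

The hard part, which this sketch skips entirely, is mapping the short name back to the right long name when the client uses it.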

There are other gotchas too. Indeed, name mangling is just the tip of the
proverbial iceberg.

One solution that some CIFS vendors have been able to implement is to develop a
whole new filesystem for their server platform, one that maintains all of the
required attributes and maps between them as necessary. This is a pain, but it
works in situations in which the server vendor has control over the deployment
of their product. One such filesystem is Microsoft's NTFS, which can handle a
very wide variety of attributes and map them to the semantics required by Apple
Macintosh clients, Unix clients, DOS clients, OS/2 clients...

You've got the basic idea. Let's run through some of the trouble spots to give
you a sense of what you're up against.



Long Filenames

Long filenames can be much more descriptive than the old 8.3
format names. The problem, of course, is that CIFS must support both long and
short (8.3) names to be fully compatible with all of the potential clients out
there. Even if a server supports only the NT LM 0.12 dialect, there will still
be instances when the 8.3 format is required. Sigh.



DOS Attributes

These are the six attribute bits that are supported by the FAT
filesystem. These do not map well to the file protections offered by other
filesystems. Compare these, for example, against the attribute bits offered by
Unix systems.

The timestamps stored in the FAT filesystem may also be different from those
used by other systems.



Extended File Attributes

These are an extended set of attribute bits and flags
available on systems using the NTFS filesystem. They are a 32-bit superset of
the set offered by the FAT system. Extended File Attributes are described in
section 3.13 of the SNIA doc.

The term "Extended File Attributes" is also sometimes misused when discussing
NTFS permissions. Permissions are different. Permissions are associated with
Access Control Entries (ACEs), and ACEs are gathered together into Access
Control Lists. There's a whole big bunch of stuff there that could be
explored--and would be, if this were a book about implementing NTFS.



Extended Attributes

These should get special mention because, it seems, CIFS is
sufficiently complex that terminology has to be recycled. Extended Attributes
(EAs) are not the same as Extended File Attributes.

EAs are a feature of HPFS and, therefore, are supported by NTFS. Basically, they
are a separate data space associated with a file into which applications may
store additional data or metadata specific to the application (things like
author name or a file comment) [58].

CIFS offers facilities to support all of these features and more. That's good
news if you are writing client code, because you can pick and choose the sets of
attributes you want to support. It's bad for server systems, which may need to
offer various levels of compatibility in order to contend with client
expectations.



--------------------------------------------------------------------------------


2.11 THAT JUST ABOUT WRAPS THINGS UP FOR SMB

If the Internet has proven anything it's that a very large number of primates
banging randomly on keyboards over a long enough period of time can and will
produce some amazingly useful software. On the other hand, if you gather some of
those primates together, place them into cubicles, and train them to perform
like circus animals...
 

Let's just get rid of these
horrible protocols.
-- Andrew Tridgell,
Samba Team Leader   

...well, we've just put a lot of effort into cleaning up the mess that was made
in those cubicles. A shame, really. It was a nice little protocol when it
started out.

Although the SNIA gave it their best shot, there are currently no industry
committees or standards groups writing bona fide specifications to be reviewed
and voted on, and no standard test suites to verify conformity. That's not to
say that specifications and test suites don't exist--quite the contrary. The
problem is that they have no teeth. With no real standards and no real
enforcement, the only measure of correctness for an SMB implementation is
whether or not it works most of the time. Since most of the clients out there
are Windows clients, the formula simplifies down to whether or not an
implementation works with Windows. An additional problem is that SMB itself is
not enough for true interoperability with Windows systems--particularly if you
want to write a workable server.

In San Jose, California, there is a mansion known as the Winchester Mystery
House. It started out as a simple farmhouse, but it was expanded over a period
of thirty-eight years by a millionaire widow with an obsessive compulsion to
keep on adding new rooms. It has stairways that rise directly into the ceiling,
windows in the floor, doors that open to solid walls...and that's just for
starters. The building covers four and a half acres and has an estimated 160
rooms.
 

 



  

CIFS is like that.

The original SMB protocol was simple and well suited to its environment. Over
the years, however, it has been greatly expanded. Several sub-protocols have
been added on as well. These subprotocols (which include such things as the
Extended Security protocols, RAP, MS-RPC, etc.) are implemented by Windows so,
if you want to build something truly compatible, SMB alone just isn't enough.

...but don't go away feeling that it is all just a hopeless mess. It is really a
question of how much effort you are willing to put into solving the problems you
will encounter. Take it one step at a time, because the individual pieces are
much less daunting than the whole.





--------------------------------------------------------------------------------

1 The X/Open SMB documentation is out of print, but electronic copies are now
available on-line (free registration required). See:
http://www.opengroup.org/products/publications/catalog/, and look for documents
#C195 and #C209.

2 I must rely on anecdotal evidence to support this claim. Due to the licensing
restrictions, I have not read these documents, which were released in March of
2002.

3 ...and Tactical Officer. He's the one with the prosthetic forehead.

4 I live in Minnesota, where it most definitely snows in winter. I share my home
with a Pembrokeshire Welsh Corgi and a Golden Retriever, so the springtime
scenario described above is vividly real and meaningful to me. Some of my
Australian Samba Team friends have suggested that people in other parts of the
world may find it less familiar. Use your imagination.

5 There are some old, archived conversations on Microsoft's CIFS mailing list
which suggest that some implementors were--and possibly still are--only allowing
for a 16-bit LENGTH field in the NBT SESSION MESSAGE.

6 Steve French says that OS/2 may have been the first OS to fully support the
UNC scheme.

7 The distinction between a URL and a URI is subtle, and confuses me to no end.
Fortunately, it is not something we need to worry about.

8 The "host" field is not really a field, but the name of a non-terminal in the
BNF grammar presented in RFC 2396. That grammar has been amended to support IP
version 6 (IPv6) addressing in RFC 2732. The SMB URL format adds support for the
use of NetBIOS names and Scope IDs, so it is a further extension of the syntax.

9 Additional source code is available at http://ubiqx.org/libcifs/.

10 See RFC 2732 for information on the use of IPv6 addresses in URLs.

11 Samba's nmbd daemon spawns a separate process to handle DNS queries, just to
get around this very problem.

12 When working with the NET USE command, it is important to remember to close
the connection to the server using the /d command-line option. Type NET HELP at
the DOS prompt for more information.

13 The original was much more detailed and interesting. It had to be edited so
that it would fit on the page, and because all those details can be distracting.

14 ...to me.

15 The first place to look is Microsoft's CIFS FTP site:
ftp://ftp.microsoft.com/developr/drg/CIFS/. The COREP.TXT file is formatted for
printing on an old-style dot-matrix printer, which makes it look a little goofy
in places (e.g., bold font is accomplished by typing a character, then
backspacing, then re-typing the same character). The same content is available
in an alternate format in the file SMB-CORE.PS. See the References section.

16 Ethereal version 0.9.3 will report the name of the last AndX Command in the
chain, rather than the first. This was fixed somewhere between 0.9.3 and 0.9.6.
The trick with Ethereal is to update early and often.

17 We are dealing with a vague definition here. According to the SNIA doc, the
SESSION SETUP is meant to "set up" the session created by the NEGOTIATE
PROTOCOL, which also makes some sort of sense. Thing is, there may be multiple
SESSION SETUP exchanges following the NEGOTIATE PROTOCOL, meaning multiple SMB
user sessions per NBT or naked TCP transport session. The waters are muddy.

18 This is exactly what jCIFS does (up through release 0.6.6 and the 0.7.0beta
series). There has been a small amount of discussion about supporting the
NT_STATUS codes, but it's not clear whether there is any need to change.

19 After all that work... Sometime around August of 2002, Microsoft posted a bit
of documentation listing the DOS error codes that they have defined. Not all are
used in CIFS, but it's a nice list to have. In addition, they have documented an
NTDLL.DLL function that converts DOS error codes into NT_STATUS codes. [Thanks
to Jeremy for finding these.]

20 The English language is Copyright (C) 1597 by William Shakespeare & Co., used
by permission, all rights deserved.

21 One of the reasons that the jCIFS project was started is that Java has
built-in Unicode support, which solves a lot of problems. That, plus the native
threading model and a few other features, made an SMB implementation in Java
very tempting. Support for Unicode in a CIFS implementation is not really
optional any more except, perhaps, in the simplest of client systems.
Unfortunately, Unicode is way beyond the scope of this book. See the References
section for some web links to get you started with Unicode.

22 ...or was, last time I checked. Once again, that URL is:
ftp://ftp.microsoft.com/developr/drg/CIFS/. See the References section for links
to specific documents.

23 There may be a further problem with raw mode. Microsoft has made some obtuse
references to obscure patents which may or may not be related to READ RAW and
WRITE RAW. The patents in question have been around for quite some time, and
were not mentioned in any of the SMB/CIFS documentation that Microsoft released
up until March of 2002. Still, the best bet is to avoid ReadRAW and WriteRAW
(since they are not particularly useful anyway) and/or check with a patent
lawyer. The Samba Team released a statement regarding this issue. See:
http://us1.samba.org/samba/ms_license.html.

24 There is no name for 10^-7 seconds. Other fractions of seconds have names with
prefixes like deci, centi, milli, micro, nano, pico, even zepto, but there is no
prefix that applies to 10^-7. In honor of the fact that this rare measure of time
is used in the CIFS protocol suite, I propose that it be called a bozosecond.

25 January 1, 1970, 00:00:00.0 UTC, known as "the Epoch", is sometimes excused
as being the approximate birthdate of Unix.

26 This is probably because Saint Paul is at the center of the universe. The
biomagnetic center of the universe used to be located across the river in
Minneapolis until they closed it down. It was a little out of whack in the same
way that the magnetic poles are not quite where they should be. The magnetic
north pole, for instance, is on or near an island in northern Canada instead of
at the center of the Arctic Ocean where it belongs.

27 A lot of time was wasted trying to figure out which configuration options
would change the behavior. The results were inconclusive. At first it seemed as
though the DomainName was included if the Windows98 system was running in User
Level security mode, and passing logins through to an NT Domain Controller. Further
testing, however, showed that this was not a hard-and-fast rule. It should also
be mentioned that if the systems are running naked transport there may not be an
NT Domain or Workgroup name. SMB can be mightily inconsistent--but not all the
time.

28 To be pedantic, the correct terms are "marshalling" and "unmarshalling".
"Marshalling" means collecting data in system-internal format and re-organizing
it into a linear format for transport to another system (virtual, physical, or
otherwise). "Unmarshalling", of course, is the reverse process. These terms are
commonly associated with Remote Procedure Call (RPC) protocols, but some have
argued (not unreasonably) that SMB is a simple form of RPC.

29 If you enjoy digging into odd details, this is a great one. See the
SMB-LM1X.PS file, also known as Microsoft Networks/SMB File Sharing Protocol
Extensions, Version 2.0, Document Version 3.3. In particular, see the definition
of a VC on page 2, and the description of the "Virtual Circuit Environment" in
section 4.a on page 10.

30 See Microsoft Knowledge Base article #301673 for more information.

31 There are a few small notes scattered about the SNIA doc that suggest that
the prescribed compression algorithm is something called LZNT. I haven't been
able to find a definitive reference that explains what LZNT is, but it appears
from the name that it is a form of Lempel-Ziv compression.

32 It was, in fact, a lot of work for the Samba Team. Those involved did a
tremendous job, and they deserve several rounds of applause. Things were much
easier for jCIFS because Java natively supports Unicode.

33 Information on RAP calls is scattered among several sources, including the
archives of Microsoft's CIFS mailing list. The SNIA doc has enough to get you
started with the basics of RAP, but see also the file cifsrap2.txt which can be
found on Microsoft's aforementioned FTP site.

34 Luke Kenneth Casson Leighton's book DCE/RPC over SMB: Samba and Windows NT
Domain Internals is an essential reference for CIFS developers who need to know
more about MS-RPC.

35 I vaguely remember a conversation with Tridge in which he indicated that
there was an obscure exception to the misalignment of the Data block. I'm not
sure which SMB, or which dialect, but if I recall correctly there's one SMB that
has an extra byte just before the ByteCount field. Keep your eyes open.

36 In addition to "something you have" and "something you know" there is
another class of access token that is sometimes described as "something you
are". This latter class, also known as "biometrics", includes such things as
your fingerprints, your DNA pattern, your brainwaves, and your karmic aura. Some
folks have argued that these features are simply "something you have" that is a
little harder (or more painful) to steal. There was great hope that biometrics
would offer improvements over the other authentication tokens, but it seems that
they may be just as easy to crack. For example, a group of researchers in Japan
was able to fool fingerprint scanners using fake fingertips created from gelatin
and other common ingredients.

37 ...sort of. Support for inclusion of a password within a URL is considered
very dangerous. The recommendation from the authors of RFC 2396 is that new
applications should not recognize the password field and that the application
should instead prompt for both the username and password.

38 Yet again we seek the wisdom of the RFCs. See Appendix A of RFC 2396 for the
full generic syntax of URLs, and RFC 2732 for the IPv6 update.

39 See the discussion of the password level parameter in Samba's smb.conf(5)
documentation for more information about these problems.

40 I don't know whether a Windows server can be configured to support Unicode
plaintext passwords. To test against Samba, however, you need to use Samba
version 3.0 or above. On the client side, Microsoft has a Knowledge Base
article--and a patch--that addresses some of the message formatting problems in
Windows2000. See: Microsoft Knowledge Base Article #257292. Thanks to Nir Soffer
for finding this article.

41 If you are interested in the workings of DES, Bruce Schneier's Applied
Cryptography, Second Edition provides a very complete discussion. See the
References section.

42 ...without whom the Authentication section would never have been written.

43 Both the X/Open doc and the expired Leach/Naik draft state that the
padding character is a space, not a nul. They are incorrect. It really is a nul.

44 The magic string was considered secret, and was not listed in the Leach/Naik
draft. The story of Tridge and Jeremy's (pre-DMCA) successful effort to
reverse-engineer this value is quite entertaining.

45 A "cracker", not a "hacker". The former is someone who cracks passwords or
authentication schemes with the goal of cracking into a system (naughty). The
latter is one who studies and fiddles with software and systems to see how they
work and, possibly, to make them work better (nice). The popular media has
mangled the distinction. Don't make the same mistake. If you are reading this
book, you most likely are a hacker (and that's good).

46 Jeremy Allison proved it could be done with a little tool called PWdump.
Mudge and other folks at the L0pht then expanded on the idea and built the now
semi-infamous L0phtCrack tool. In July of 1997, Mudge posted a long and detailed
description of the decomposition of LM challenge/response, a copy of which can
be found at: http://www.insecure.org/sploits/l0phtcrack.lanman.problems.html.
For a curious counterpoint, see Microsoft Knowledge Base Article #147706.

47 Andrew Bartlett prefers to call this the "NT Hash", stating that the NT Hash
is passed through the LM response algorithm to produce the NTLM (NT+LM)
response.

48 MD4 is explained in RFC 1320 and MD5 is in RFC 1321. HMAC in general, and
HMAC-MD5 in particular, is written up in RFC 2104. ...an embarrassment of
riches. As usual with this sort of thing, a deeper understanding can be gained
by reading about it in Bruce Schneier's Applied Cryptography, Second Edition.
See the References section.

49 The lab in the basement is somewhat limited which, in turn, limits my ability
to do rigorous testing of esoteric CIFS nuances. You should probably verify
these results yourself. Andrew Bartlett (him again!) turned up an interesting
quirk regarding the NTLMv2 Response calculation when authenticating against a
stand-alone server. It seems that the NT Domain name is left blank in the v2hash
calculation. That is: destination = "";

50 Luke Kenneth Casson Leighton's book DCE/RPC over SMB: Samba and Windows NT
Domain Internals gives an outline of the structure of the data blob used in
NTLMv2 Response creation. Using Luke's book as a starting point, the details
presented above were worked out during a late-night IRC session. My thanks to
Andrew Bartlett, Richard Sharpe, and Vance Lankhaar for their patience,
commitment, and sudden flashes of insight. Thanks also to Luke Howard for later
clarifying some of the finer points.

51 A quick web search for "LMCompatibility" will turn up a lot of references,
Microsoft Knowledge Base Article #147706 among them.

52 ...so that if something goes wrong you can blame them, and not me.

53 It might be worth doing some testing if you really want to use DOS codes in
your implementation, but also want Extended Security. It may be possible to use
the NT_STATUS codes for this exchange only, or you might try interpreting any
unrecognized DOS error code as if it were STATUS_MORE_PROCESSING_REQUIRED.
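
The second suggestion might look something like the following Python sketch. The table of known DOS codes is deliberately tiny and purely illustrative, and the function name interpret_dos_error() is invented for this example. The only firm value here is the NT status code for STATUS_MORE_PROCESSING_REQUIRED (0xC0000016). Whether real servers behave in a way that makes this fallback safe is exactly the thing that needs testing.

```python
# NT status value for STATUS_MORE_PROCESSING_REQUIRED.
STATUS_MORE_PROCESSING_REQUIRED = 0xC0000016

# DOS error classes.
ERRDOS = 0x01
ERRSRV = 0x02

# Illustrative and incomplete: a real client would list every
# (class, code) pair that it knows how to handle.
KNOWN_DOS_ERRORS = {
    (0x00, 0): "SUCCESS",
    (ERRDOS, 2): "ERRbadfile",
    (ERRDOS, 5): "ERRnoaccess",
    (ERRSRV, 2): "ERRbadpw",
}


def interpret_dos_error(err_class, err_code):
    """Return (name, nt_status) for a DOS error class/code pair.

    Known pairs come back with their DOS name and no NT status.
    Anything unrecognized is treated as if the server had sent
    STATUS_MORE_PROCESSING_REQUIRED (hypothetical fallback policy).
    """
    name = KNOWN_DOS_ERRORS.get((err_class, err_code))
    if name is None:
        return ("STATUS_MORE_PROCESSING_REQUIRED",
                STATUS_MORE_PROCESSING_REQUIRED)
    return (name, None)
```

The obvious risk is that a genuine but unlisted failure code would be mistaken for "keep going", so the table of known codes would need to be as complete as possible before relying on this.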

54 The latter name--though decidedly less Freudian--is somewhat gender-biased.

55 Jean-Baptiste Marchand has done some digging and reports that starting with
Windows 2000 the SMB redirector (rdr) has been redesigned, which may impact which
registry keys are fiddled. The preferred way to configure SMB MAC signing in
Windows 2000 is to use the Local Security Settings/Group Policy Management
Console (whatever that is). Basically, this means that Windows 2000 and Windows XP
have MAC signing settings comparable to those in Windows NT, but they are handled
in a different way.

56 I vaguely remember a presentation given by David Korn, author of the Korn
Shell (ksh), regarding AT&T's UWIN project. At the end of the presentation there
was some discussion regarding the differences between standard Posix APIs and
Win32 APIs. It was pointed out that there were hundreds or possibly thousands of
permutations of parameter values that could be passed to the Posix open()
function. The number of permutations for the equivalent Win32 function, it was
reported, was on the order of millions. How the heck do you test all those
possibilities?

57 Digging through the documentation, it appears that the FAT family consists of
FAT12, FAT16, FAT32, and VFAT. There is documentation on the web that provides
implementation details, if you are so inclined.

58 NTFS is a complex filesystem based on some simple concepts. One such is that
each "file" is actually a set of "attributes" (records). Many of these
attributes are pre-defined to contain such things as the short name, the long
name, file creation and access times, etc. The actual content of the file is
stored in a specific, pre-defined "stream", where a stream is a particular kind
of attribute. NTFS supports OS/2-style Extended Attributes in another type of
NTFS attribute... and it just gets more confusing from there. There is a lot of
documentation on the web about the workings of NTFS, and there is a project
aimed at implementing NTFS for Linux.



--------------------------------------------------------------------------------

Copyright © 1999-2004 Christopher R. Hertel 
All rights reserved.   $Revision: 1.289 $