
DREAMING IN CODE





SCUDERIA FERRARI

February 20, 2021 5:47

It has been a very long time since my last post.

The main reason is that at the end of 2016 I landed a new job in a wonderful
but sensitive industry: Formula 1. Specifically, with Scuderia Ferrari, probably
the most secretive company in this tight-lipped industry.

So for a while, I decided it was best to just give up writing rather than find
ways to say things without saying too much.

But now that I am saying "arrivederci", it is time for me to write a little
recap.

At the very beginning of 2017 I joined a wonderful team of very skilled, very
senior developers, and I had a blast working at such a high level for 4
wonderful years.

I had the chance to get insights into the development of race cars, visit
places very few people are admitted to, attend official Formula 1 events and
work in the pitlane alongside cars, mechanics and drivers.
I still remember my first time at the track, and I’ll probably treasure that
experience forever. I was there for the first fire-up of the 1000hp Ferrari
engine in the morning, and believe me, even if these turbocharged hybrid power
units are not the 12 cylinder screamers of the olden days, listening to them
come to life gives me the chills.
And that’s not even the best part of the job.

I got to work in the team responsible for the software orchestrating all kinds of
measured and simulated data. Data coming from real-time on-board ECUs, from
dynamics simulation of mechanical components, from dynos and benches, from the
mega simulators used by drivers and engineers to test cars, setups and tracks.
All kinds of high frequency telemetry were coming through our software, which
provided ways to run, read, correlate and analyze them and, of course, visualize
them.

A mighty piece of architecture with tens of services and programs all working
together, and I had the opportunity to work on it and help in shaping it.

But all good things come to an end, and so here I am, at the end of a chapter
and ready for a new adventure.

I want to write down the good lessons I learned during these four years; for me,
it’s very important to look back and ask myself “what have I learned?”

I can tell an experience was good for me if I realize I was able to learn a
lot; and even as a senior developer, I found I could still improve and learn,
and that’s fantastic. Here are the main lessons:

Always be ready to defend and attack ideas. The level of brainstorming and
discussion was mind-blowing. Everyone, in the team and outside it, was always
ready to challenge ideas and respond to challenges. Ideas were thrown at you all
the time, and I quickly began to do the same. It can be intimidating, but it’s
the most efficient (if brutal) way to weed out mediocre ideas. Only solid,
well-thought-out designs survive.

Code for change: in a business where evolution happens weekly (from one race to
the next) and each year you have what is basically a new product, the only way
software can be written is with change in mind. Which definitely does NOT mean
designing or writing it to cover all scenarios; quite the opposite: make it
simple, and make it easy to go in and change it, or rewrite parts of it. I am
sure my refactoring skills got to a new level!

What mission critical means: when the software you write can prevent a
multi-million euro race car from leaving the pits, missing a race or a qualy in
front of millions of people… you get a new idea of what mission critical is.
Sure, there are more critical things, related to safety and human lives, but
for this business failure is not an option. Which gave me a whole new
perspective on…

Testing. What I learned and did in this area could fill a post by itself!

And finally, remote working. I was able to work the majority of the time
remotely, in a very smooth way. Sure, regular visits to the factory were
necessary (and pleasant) from time to time, but the team was organized to be
remote-first, I had previous experience with remote working, and I have to say
that when both parties are committed, it works great. It will be difficult, if
not impossible, for me to give up remote working; there are too many advantages
(for both parties, I believe).

Of course, the shiny world of Formula 1 has its downsides. It can be
exhausting, cruel, and excessively competitive. But for me, it was worth it. It
is an experience that demanded a lot, but gave me a lot, and let me meet lots of
great people, some of whom I can now call my friends.


--------------------------------------------------------------------------------


FORMULA 1 TESTING

February 19, 2021 2:03

I can't believe a year ago I was at the track deploying and testing our latest
streaming server and services...

 

The pandemic, budget cap and their consequences changed everything.

Good luck to Seb, Charles and all my old team mates!


--------------------------------------------------------------------------------


ON REST SERVICES PERFORMANCE

December 10, 2016 4:25

Recently, I had to investigate a “performance issue” a customer was having with
one of their web services.




To make it simple, the service is a REST API to get information about points of
interest. The response is quite large (hundreds of KBs) but nothing exceptional.

Several clients can perform multiple requests for the same POI, and the
response for a single POI is almost the same each time: it varies a little with
real-time updates (traffic, info, last-minute additions or cancellations), but
it is roughly stable. So the code was already doing the right thing, and cached
the answer for each POI.

Well… more or less the right thing. For a single POI, with ~1000 sub-items, the
response time for the first request was ~39 seconds. Subsequent requests
required half a second, so the caching was working.

The API is for consumption by a service, so there is no need to be “responsive”
(as in “users will need quick feedback or they will walk away”), but still: 39
seconds!




The API is implemented in Java (JAX-RS + JPA, to be precise), so armed with the
profiler of choice (VisualVM) I started hunting for hot spots. Here is a list of
DOs and DON’Ts I compiled while investigating and fixing the issues, which may
come in handy. The list is not tied to Java; it is very general!




 * DO instrument your code with log calls with timings at enter/exit of “hot”
   functions (see the sketch after this list).

 * DO it at a logging level you can leave on in production (e.g.: INFO. But then
   leave INFO on!)

 * If you didn’t do that… you don’t have timings :( But you need timings to see
   where you need to improve, so DO use a profiler!

 * DON’T just go for the function you believe is the slowest: a profiler trace
   may surprise you.

 * DO use a profiler with instrumentation, not with sampling. In my experience,
   sampling is never precise enough.

 * When you have found your hot spots, DO move all costly, repeated operations
   you find to a place where they are done once (a constructor or initialization
   method). In this case, the offender was an innocent looking
   Config.get(“db.name”) method. Just to get the DB name from a config class.
   Which ended up opening a property file, reading it, parsing it every time.
   The method was doing a lot under the hood, but would you have looked at it
   without the hint from a profiler? See the previous point :)

 * DO cache data that does not change, if you are reading it from a DB or web
   service. Caching the output is the basic step, but it is often not nearly
   enough. You have to avoid multiple lookups for the same resource inside a
   single request!

 * DON’T do a DB (or even a cache!) lookup if you can find another way to get
   the same information, even when you need to re-compute a result (i.e. spend
   CPU time). In this service, each POI sub-item could be categorized in one of
   two classes using some of its attributes. The old implementation used a
   subset of attributes that needed to be checked with a DB lookup; I changed it
   to use a different set of attributes that needed a (simple) computation.

 * DO load the cache in bulk for small-ish sets of data. In this service, each
   string which contained information to be displayed to the user was looked up
   in a DB of “special cases” using complex fallback rules, each lookup
   generating progressively less refined queries (up to 4). If nothing was found
   (~80% of the time), a default string was loaded from a Web Service. This
   operation alone accounted for 10 seconds, or 25% of the total time. The “not
   default” DB contains just around 4k items; a bulk query for all the rows
   requires only 100ms, the result can easily be stored in memory, and doing the
   filtering and matching in memory costs just a few ms more.

 * DO use simple libraries: communication with other Web Services was done
   using a very easy-to-use but quite heavy library (Jersey + Jackson for JSON
   deserialization). I switched to a custom client written with OkHttp and GSON,
   and the net saving was 4 whole seconds.

 * DO enable compression on the result (if the user agent says it supports
   compression - most do!)

 * DO minimize copy and allocations: in this case (but this advice applies to
   Java in general), I used streams instead of lists whenever possible, down to
   the response buffer.

 * DON’T use the DB, and especially NOT the same DB you use for your primary
   data, to store “logs”. In this case, it was access logs for rate limiting. A
   client hitting the service hard could consume a lot of resources just to
   generate a 429 Too Many Requests response.
   Recording such an event to your primary DB is the perfect opportunity for a
   DoS attack.
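
To make the first points (and the "costly, repeated operations" one) concrete,
here is a minimal, self-contained sketch. It is in C# purely for illustration
(as said, the list is not tied to Java) and every name in it is hypothetical:
timing logs at enter/exit of a hot method, at a level you can leave on in
production, plus a read-once cache so that an offender like the
property-file-parsing Config.get("db.name") is paid for only once.

   using System;
   using System.Collections.Concurrent;
   using System.Diagnostics;
   using System.Threading;

   public static class TimingAndCachingSketch
   {
      // Stand-in for a costly, repeated operation (opening and parsing a
      // property file, a DB lookup, ...). Hypothetical, for illustration only.
      private static string ExpensiveLookup(string key)
      {
         Thread.Sleep(50); // simulate the I/O cost
         return "value-of-" + key;
      }

      // Read-once / memoized cache: the cost is paid on the first call only.
      private static readonly ConcurrentDictionary<string, string> Cache =
         new ConcurrentDictionary<string, string>();

      private static string CachedLookup(string key) => Cache.GetOrAdd(key, ExpensiveLookup);

      public static void Main()
      {
         var watch = Stopwatch.StartNew();
         Console.WriteLine("GetPoi: enter");            // log on entry...

         for (var i = 0; i < 100; i++)
            CachedLookup("db.name");                    // only the first call is slow

         // ...and log on exit with the elapsed time, at a level (e.g. INFO)
         // that stays on in production.
         Console.WriteLine($"GetPoi: exit after {watch.ElapsedMilliseconds} ms");
      }
   }

In the real (Java) service the equivalent fix was simply moving the property
file read into an initialization method, exactly as the point above suggests.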




Remember the times?

 * 39 seconds for the first request

 * 0.5 seconds for subsequent requests on the same object

Now:

 * 1 second for the first request

 * 50 milliseconds (0.05 seconds) for subsequent requests on the same object




It is more than an order of magnitude. I’m quite happy, and so was the
customer! The time could be brought down even further by ditching the ORM
framework (JPA in this case) and going for native (JDBC) queries, changing some
algorithms, or using a different exchange format (e.g. protobuf instead of
JSON), but with increasing effort and diminishing returns. And for this
customer, the result was already more than they asked for.





--------------------------------------------------------------------------------


CONTAINERS, WINDOWS, AND MINIMAL IMAGES

December 08, 2016 5:41

Recently I have watched with awe a couple of presentations on Docker on
Windows. Wow… proper containers on the Windows kernel, I didn’t see that coming!
I thought that “porting” cgroups and namespaces from Linux was something hard to
accomplish. Sure, all the bits were already almost there: Windows has had
something similar to cgroups for resource control (Jobs: sets of processes which
can enforce limits such as working set size, process priority, and end-of-job
time limit on each process associated with the job) since NT 5.1
(XP/Windows Server 2003), and NT has had kernel namespaces since its beginning
(NT 3.1). For details I recommend reading this excellent article: Inside NT
Object Manager.




However, seeing these bits put together and exposed to userland with a nice
API, and Docker ported (not forked!) to use it, is something else.




Of course, I was instantly curious. How did they do that? The Windows Containers
Documentation contains no clue: all you can find is a quick start.




There are a couple of videos of presentations given at DockerCon EU 2015 and
DockerCon 2016, but documentation is really scarce. Non-existent.

From the videos you understand that, as usual, Windows does not expose in an
official way (at least, for now) the primitives needed to create the virtual
environment for a container, but rather exposes a user-mode DLL with a
simplified (and, hopefully, stable) API to create “Compute Systems”. One of the
exposed functions is, for example, HcsCreateComputeSystem.




A search on MSDN for vmcompute.dll or, for example, HcsCreateComputeSystem,
reveals nothing… the only documentation is found in a couple of GitHub projects
from Microsoft: hcsshim, a shim used by Docker to support Windows Containers by
calling into the vmcompute.dll API, and dotnet-computevirtualization, a .NET
assembly to access the vmcompute.dll API from managed languages.




Once it is documented, this “Compute Systems” API is surely something I want to
try out for Pumpkin.




Meanwhile… there is a passage in the presentations and in the official
Introducing Docker for Windows Server 2016 announcement that left me with mixed
feelings. You cannot use “FROM scratch” to build your own image; you have to
start with a “minimal” Windows image.




Currently, Microsoft provides microsoft/windowsservercore or
microsoft/nanoserver.

The Windows Server Core image comes with a mostly complete userland with the
processes and DLLs found on a standard Windows Server Core install.  This image
is very convenient: any Windows server software will run on it without
modification, but it takes 10GB of disk space! The other base layer option is
Nano Server, a new and very minimal Windows version with a pared-down Windows
API. The API is not complete, but porting should be easy and it is less than
300MB.

But why do I need at least a 300MB image? The point of containers is to share
the kernel of the host, isn’t it?




The explanation is buried in one of the DockerCon presentations, and it makes a
lot of sense: the Win32 API is exposed to Windows programs through DLLs, not
directly as syscalls.


(Note: I will call it the “Win32 API” even on x64, because there really isn’t
any “Win64” API: it is the same! Just look at the names of the libraries:
kernel32, gdi32, user32, …)




Of course, internally those DLLs will make syscalls to transition to kernel mode
(sort of, more on this later), but the surface you program against in Windows is
through user mode DLLs.

What really hit me is the sheer number of “basic” components required by Windows
nowadays. I started to program using the Win32 API when there were only 3 of
these DLLs. OK, 4 if you count advapi32.dll.

Sure, then there was ole32, if you wanted to use OLE, and comctl32, if you
wanted “fancy” user controls, and ws2_32 if you wanted sockets… But they were
“optional”, while now without csrss.exe, lsass.exe, smss.exe, svchost, wininit,
etc. you cannot even run a “no-op” executable.



Or can you?




I will take an extremely simple application, which “touches” (creates) a very
specific file (no user input, to make things easier), and try to remove
dependencies to see how far you can get (spoiler alert: very far!)




I will divide my findings in three blog posts, one for each “step” I went
through, and update this post with links every time I post a new one.

 * Step 1: no C runtime
 * Step 2: no Win32 API
 * Step 3: no Native API (no ntdll.dll)

For the impatient: there is a github repo with all the code here :)




--------------------------------------------------------------------------------


INTEGRATE AN "IOT" DEVICE WITH A WEB APPLICATION: THE GOOD, THE BAD AND THE UGLY

October 07, 2016 10:19

An interesting scenario that I keep bumping into is this: there is a device,
typically "headless" (no significant UI), which performs some specific, useful
function (an "IoT" device. Note the quotes). There is a web app, with a rich UI.
Your customers want them to "talk".

I smell trouble at the "talk" part. 



 * Data collection + display? Sure, IoT is born to do exactly this.
 * Control the device, issuing commands? You have to be careful with this one,
   naive solutions bring a lot of trouble.
 * Getting interactive feedback? (Command - response - Web UI update) Let's talk
   about this, because it can be done, but it is not so straightforward.



The trouble with this scenario is that it looks so simple to non-technical
people (and less technical people alike... I once heard an IT manager ask how he
could wire up his scanner to our web application, but that's another story).
However, it is so easy to come up with bad or ugly solutions!

Fortunately, with a couple of new-ish technology bits and some good patterns, it
is possible to come up with a good solution.




THE GOOD 

Well... yeah, if you follow The Good, The Bad and the Ugly trinity, you have to
start with "The Good". But I don't want to talk about the good already, it would
spoil all the fun!

Let's come back to the good later.





THE BAD



Sending commands to a device is quite easy. You open a socket on the device,
listening. You somehow know where the device is and which address it has (an
interesting problem in its own right, but I digress), so from your web server
you just connect to that socket (address/port) and send your commands. You
probably don't have a firewall, but if you have one, just punch a hole through
it to let messages pass through.



UGH. BAD.

Even if you go through all the effort of making it secure (using an SSH tunnel,
for example, but I have seen plain-text sockets with ASCII protocols. Open to
the Internet.), you are exposing a single port, probably on a low-power device
(like an ARM embedded device), possibly using a low-bandwidth channel (like
GPRS). How much does it take to DoS it? You probably don't even need the first D
(as in DDoS); you could do it from a single machine.

But let's say you somehow try to cover this hole, maybe with a VPN, inserting a
field gateway in front of your "IoT" device(s), or putting a VPN client inside
the devices themselves if they are powerful enough (and the aforementioned
client exists for your platform/architecture. Good luck with some of the ARM v4
devices with 1MB (mega!) of disk space I have seen, but I digress again).

Great, you are probably relieved because now you can have interactive feedback!



You see, it is easy. Your user clicks on the page. On the web server, inside
the click handler (whatever that is: a controller, a handler, a servlet...) you
open a socket, send a command through TCP, and wait for a response. The client
receives the command, processes it, answers back through the socket and closes
the connection. The web server receives the answer, prepares an HTTP response
and returns it to the web browser. Convenient!






Now you have thread affinity all the way through: the same thread of execution
spans servers, programs and devices. Blocking threads is a performance
bottleneck in any case, but it is a big issue on a server.

UGH. BAD.

If the network is slow (and it will be), you may end up keeping the web server
thread hanging for seconds. Let's forget about hanging the web browser UI (you
can put up a nice animation, or use Ajax), but keeping a web server thread hung
for seconds doing nothing is BAD. As in "2 minutes, and your web server will
crash from resource exhaustion" bad.





THE UGLY

IoT is difficult, real-time web applications are difficult, so let's ditch
them. 

We go back one or two decades and write "desktop" applications. The
functionality provided by the former web application is exposed as web services,
and we consume them from our desktop app. Which is connected directly to the
device. Which is not an "Internet of Things" device anymore, but maybe an
"Intranet of Things" device (I should probably register that term! :) ), if it
is not simply connected over USB.

It makes sense in a lot of cases, if the device and the PC/tablet/whatever are
co-located. But it imposes a lot of physical constraints (there has to be a
direct connection between the device and the PCs/tablets that control it). Also,
if the app was a web app to begin with, there are probably good reasons for
that: ease of deployment, developers familiar with the framework, runs on any
modern web browser, ...

Especially if you discover that half your clients are using Windows PCs, the
other half Linux, and a third half Android tablets. Now you need to build and
maintain three different desktop applications. Which is an ugly mess.

Besides, how do you reach your "IoT" device now, if it is on a private
intranet? How do you update it, and collect diagnostics and logs in a central
location? You cannot, or you have to set up complicated firewall rules, policies
and local update servers. Again, feasible, but ugly.





THE GOOD (FINALLY)

The solution I came up with is to use WebSockets (or better, a multi-transport
library like SignalR) + AMQP "response" queues to make it good.

AMQP is a messaging protocol. It is a rising standard, and it is implemented by
many (most) queuing servers and event hubs (see my previous post). An
interesting use of AMQP is to create "response queues". A hint on how this might
work is given, for example, in the RabbitMQ tutorials: the last tutorial in the
series describes how to code an RPC mechanism using AMQP.






The interesting part of the tutorial is in the client code:






   var consumer = new QueueingBasicConsumer(channel);
   var replyQueueName = channel.QueueDeclare(exclusive: true, autoDelete: true).QueueName;

   channel.BasicConsume(queue: replyQueueName,
                        noAck: true,
                        consumer: consumer);

   // Do "Something special" here with replyQueueName

   while (true)
   {
      var ea = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
      if (ea.BasicProperties.CorrelationId == corrId)
      {
         return Encoding.UTF8.GetString(ea.Body);
      }
   }

The client declares a nameless queue (the name is auto-generated). Plus, the
queue should be deleted on disconnection. There are two flags for that:
exclusive and autoDelete.

Then it does something with the queue name, and then continuously waits for and
reads messages from this queue.

The "something special" is: communicate to the server the device availability,
and specify the name of the queue. This queue will be like an inbox for our
device: a place on a server where who wants to communicate with us will place a
message. The device (or better, a thread/task on the device) will wait for any
incoming message, read it, and then dispatch it. The device will act
accordingly.

It is important to note that the client is establishing the connection to a
server (the queueing system server), not the other way around. This prevents the
problem highlighted in the "Bad" section.
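
As a concrete sketch of the device side, here is roughly how the whole dance
could look. It uses the same RabbitMQ .NET client as the tutorial; the announce
endpoint, device id and host names are all hypothetical, just to show the shape
of it.

   using System;
   using System.Net.Http;
   using System.Text;
   using RabbitMQ.Client;
   using RabbitMQ.Client.Events;

   public static class DeviceSketch
   {
      public static void Main()
      {
         var factory = new ConnectionFactory { HostName = "queue.example.com" };
         using (var connection = factory.CreateConnection())
         using (var channel = connection.CreateModel())
         {
            // The device's "inbox": nameless, exclusive, deleted on disconnection.
            var replyQueueName = channel.QueueDeclare(exclusive: true, autoDelete: true).QueueName;
            var consumer = new QueueingBasicConsumer(channel);
            channel.BasicConsume(queue: replyQueueName, noAck: true, consumer: consumer);

            // The "something special": announce availability and the inbox name
            // to the web application (hypothetical endpoint).
            using (var http = new HttpClient())
            {
               var announce = new StringContent(
                  $"{{\"deviceId\":\"dev-42\",\"queue\":\"{replyQueueName}\"}}",
                  Encoding.UTF8, "application/json");
               http.PostAsync("https://webapp.example.com/devices/announce", announce).Wait();
            }

            // Wait for commands and dispatch them; the device acts accordingly.
            while (true)
            {
               var ea = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
               var command = Encoding.UTF8.GetString(ea.Body);
               Console.WriteLine($"Executing command: {command}");
            }
         }
      }
   }

Note again that both connections (to the queueing server and to the web
application) are outbound from the device, so no listening socket is ever
exposed.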

WebSockets (and related transport mechanisms, like Server-Sent Events, long
polling, etc.) allow code on the server side to push content to the connected
clients as it happens, in real time.
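
On the web application side, the push half could be sketched with ASP.NET
SignalR; the hub, the group and the deviceUpdated method below are hypothetical
names, just to show the mechanism.

   using System.Threading.Tasks;
   using Microsoft.AspNet.SignalR;

   // Browsers connect to this hub (over WebSockets or whatever transport
   // SignalR negotiates) and register their interest in a device.
   public class DevicesHub : Hub
   {
      public Task WatchDevice(string deviceId)
      {
         return Groups.Add(Context.ConnectionId, deviceId);
      }
   }

   public static class DeviceNotifier
   {
      // Called by whatever receives the device's response (for example the
      // worker reading from the queue): push the update, in real time, to
      // every browser watching that device.
      public static void PushUpdate(string deviceId, string payload)
      {
         var hub = GlobalHost.ConnectionManager.GetHubContext<DevicesHub>();
         hub.Clients.Group(deviceId).deviceUpdated(payload);
      }
   }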

So the device communicates with the web server in some standard way (a direct
HTTP POST to the server or, even better, posting to a queue and then having a
worker read from the queue and POST to the web server, so you get queue-based
load levelling), and the server pushes the update to the client.

Note that the server knows which devices are "on" and can be controlled by the
client (because the first thing a device does is announce its availability to
the server, and where it can be contacted), and it can also know which client is
talking to which device, because traffic passes through the server.



Put the two together, and you have a working system for real-time command and
control of "IoT"-like devices, with real-time feedback and response, from a
standard web application.
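
Tying it together on the web server, the click handler only has to drop the
command into the queue the device announced and return; nothing blocks waiting
for the device, and when the response arrives it is pushed to the browser
through the hub. A minimal sketch, again with hypothetical names:

   using System;
   using System.Text;
   using RabbitMQ.Client;

   public class CommandSender
   {
      private readonly IModel channel;

      public CommandSender(IModel channel)
      {
         this.channel = channel;
      }

      // Invoked from the web request handler: fire the command and return.
      // No socket to the device, no thread parked waiting for the answer.
      public string SendCommand(string deviceQueue, string command)
      {
         var props = channel.CreateBasicProperties();
         props.CorrelationId = Guid.NewGuid().ToString();

         channel.BasicPublish(exchange: "",
                              routingKey: deviceQueue,  // the queue the device announced
                              basicProperties: props,
                              body: Encoding.UTF8.GetBytes(command));

         // The correlation id lets us match the device's later response, which
         // will then be pushed to the browser (e.g. via DeviceNotifier.PushUpdate).
         return props.CorrelationId;
      }
   }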




--------------------------------------------------------------------------------

Hello! I'm Lorenzo Dematte, a senior software dev working for Scuderia Ferrari.
Facts and opinions expressed in this blog are my own.

Twitter: ldematte
Github: github.com/ldematte
StackOverflow: ldematte

<< Older Posts

Copyright 2020 - Lorenzo Dematte