
EXCITING STUFF CLUB






COFFEE BOSS DAY 10 – NEAT BOARD AND RFID

Been a while! Coffee Boss has had a great upgrade recently: the current sensor.
This, combined with new firmware that counts the time since the current last
went low (signifying that a boil cycle has completed and therefore that a new
pot is ready), means that this is suddenly actually useful.
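
As a rough illustration of that firmware logic, here is a minimal Python sketch. The real firmware is the Arduino code in coffee_boss.ino and may well work differently. The class name, the timestamps and the readings are made up for the example, and the 40,000 threshold is borrowed from the Day 9 post further down.

```python
class BoilTracker:
    """Remember when the heater current last dropped from high to low."""

    def __init__(self, threshold):
        self.threshold = threshold      # assumed cut-off between heater on and off
        self.heater_on = False
        self.last_boil_finished = None  # time of the last high-to-low transition

    def update(self, reading, now):
        """Feed one current reading taken at time `now` (in seconds)."""
        on = reading > self.threshold
        if self.heater_on and not on:
            # The current just went low, so a boil cycle has completed.
            self.last_boil_finished = now
        self.heater_on = on

    def pot_age(self, now):
        """Seconds since the last boil finished, or None if none was seen."""
        if self.last_boil_finished is None:
            return None
        return now - self.last_boil_finished


tracker = BoilTracker(threshold=40000)
# (time in seconds, reading) pairs, invented for illustration
for t, reading in [(0, 2500), (10, 52000), (20, 52000), (30, 2500), (90, 2500)]:
    tracker.update(reading, t)

print(tracker.pot_age(now=90))
```

Only the transition from high to low is recorded, so a long boil does not keep resetting the clock while the heater is still running.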

Pictured here are the various circuits combined on one piece of board. The back
is a bit of a rats-nest but this should all fit in a project box reasonably
neatly. I’ve made up a power cable with twisted mains cable where the individual
conductors can be separated out without having to cut any insulation but I need
to get that cleared off with the estates people – I would still prefer a sealed
moulded plug on the part that plugs into the coffee machine itself.

The reason why I put the current sensor on this part (near the machine, where
the water is) is because the other option is down on the floor, and the wire
going to the sensor isn’t very long. That’s still a decent option for a more
permanent installation.

Next to it is a new RFID reader, a board made by Elechouse based on a PN532
chip. This is a neat little beast but I’ve had a right battle getting it
working. Elechouse have some libraries for Arduino but there are some improved
ones here:

 * https://github.com/picospuch/PN532

This module has a serial, an SPI and an I2C interface, with a little set of dip
switches to control which one to use. I tried all kinds of ways and could. not.
get. it. working. on the ESP32, only to have it work first time with a regular
ATMEGA based Arduino board. Something about the ESP32 perhaps? My circuit?
Likely.

Cut a long story short, I have to do a full power cycle of the ESP32 to get the
PN532 to initialise properly. Reset won’t do it – I presume the reset button only
resets the microcontroller rather than actually interrupting the power – and that
makes a difference.

Only just got this working this evening, and made up some neat cables. I’ll take
it for a test drive tomorrow.


WHAT’S THE RFID READER FOR?

Oh right, yes, it’s so that coffee drinkers can bump their work ID when they
have a cup of coffee and the machine will repeat back how many cups they’ve had
recently. It’s so they can think about how much they owe. It won’t ever know who
they are, but the cards that our IDs are printed onto have unique identifiers so
it can track a card across sessions.
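
A minimal Python sketch of that counting idea, assuming the reader hands over the card's unique identifier as a string. The class name, the one-week window and the example UID are all hypothetical, and nothing here links a UID to a person.

```python
from collections import defaultdict, deque


class CupCounter:
    """Count recent taps per card UID without knowing who owns the card."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.taps = defaultdict(deque)  # card UID -> timestamps of recent taps

    def tap(self, uid, now):
        """Record a tap and return how many cups this card has had recently."""
        times = self.taps[uid]
        times.append(now)
        # Forget taps that have fallen out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return len(times)


WEEK = 7 * 24 * 3600
counter = CupCounter(window_seconds=WEEK)
counter.tap("04:a2:19:7f", now=0)          # made-up UID
counter.tap("04:a2:19:7f", now=3600)
print(counter.tap("04:a2:19:7f", now=WEEK + 1800))
```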


THE CONNECTORS SAGA

 1. I decided to use Molex Mini-fit jr sockets on this because I had a few sets
    left-over from Polargraph drawing machines. I never actually used them on
    the Polargraph and they’ve been burning a hole in my pocket ever since,
    because they are so adorable.
 2. I dismantled the Coffee Boss machine and stripped down the cable ends… And
    realised that the metal crimp contacts I had to go into the Mini-fit jr
    plugs were the wrong kind! I’ve got a reel of contacts for “Microlock Plus”
    system. As soon as I noticed, I remembered kicking myself for making that
    mistake the first time around. At least one reason why I never used these
    plugs and sockets.
 3. I ordered some crimp contacts for Mini-fit jr, along with a couple of sets
    of backshells for the plugs, so they look super pro. Expensive! But worth it
    for the pro-ness.
 4. When they arrived after the weekend, I crimped a couple onto the current
    sensor… And realised they wouldn’t fit into the plugs I had. I hadn’t wanted
    to test it before crimping them in case they got stuck in the plug and I
    couldn’t get them out.
 5. I realised that my plugs and sockets weren’t Mini-fit jr at all… They were
    Micro-fit 3.0.
 6. Always had been Micro-fit 3.0.
 7. And a cursory glance at the bags they were stored in confirmed that.
 8. So I ordered some crimp contacts for Micro-fit 3.0. Got them, fitted them
    tonight, and now we’re cooking.
 9. I wish there were some backshells to be had for the Micro-fit system, they
    are super pro.

This entry was posted in Uncategorized and tagged arduino, coffee-boss,
connectors, current sensor, elechouse, esp32, micro-fit-3.0, NFC, PN532, RFID on
December 3, 2019 by sandy.


COFFEE BOSS DAY 9 – A SIMPLER CURRENT ARRANGEMENT AND COMBINING SENSORS

So I made myself a cable like this, with a section of the outer insulation cut
away so that I could separate the insulated conductors inside and put the clamp
around just one of them.

I tested it with the plain, unmodified coffee machine and found that it was
quite easy to tell when the heater was running and when the PTC hotplates were
on. Because of that I’m not going to bother with any internal interventions
inside the machine at all.

I used emonlib (https://github.com/openenergymonitor/EmonLib) and the plain
current_only.ino example. Worked great, and I noticed a few interesting things.
The clamp can be calibrated to give actual representative current values but I
don’t care about that, I just need any old numbers so I went with the standard
calibration numbers that were in the code.

 * Bottom PTC – peaked at 20,000 for less than a second when it turns on, then
   settles down to 2,500.
 * Top PTC – peaked at 19,000 and settled at 4,000.
 * Heater – immediately goes to about 52,000

So I think I can simply set a threshold around 40,000 and expect that breaking
this threshold means that the heater is on. There’s no ambiguity about the
hotplates and the heater. I’m going to make a cable with separate conductors
that won’t terrify the electricians at work, and just use that.
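
To check how much room that threshold leaves, here is a small Python sketch using the readings quoted above. The worst case it tests, both hotplates hitting their switch-on peak at the same moment with their readings simply adding, is an assumption and not something that was measured.

```python
# Readings quoted above: uncalibrated EmonLib numbers, not real amps.
BOTTOM_PTC_PEAK = 20000
TOP_PTC_PEAK = 19000
HEATER = 52000
THRESHOLD = 40000


def heater_is_on(reading, threshold=THRESHOLD):
    """Treat any reading above the threshold as the heater running."""
    return reading > threshold


# Assumed worst case: both hotplates at their switch-on peak together.
worst_case_hotplates = BOTTOM_PTC_PEAK + TOP_PTC_PEAK

print(heater_is_on(worst_case_hotplates))
print(heater_is_on(HEATER))
print(THRESHOLD - worst_case_hotplates)
```

In that assumed worst case the margin below the threshold is small, but the peaks last less than a second, so requiring a couple of consecutive readings above the threshold would be one way to rule them out.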

I’ve just committed
https://github.com/euphy/coffee_boss/blob/master/coffee_boss/coffee_boss.ino
which is the code to read all these things:

 * DS3231 Realtime clock on the i2c bus (SDA is 21, SCL is 22)
 * load cells with the HX711 ADC on pins 4 and 15
 * VCNL4010 proximity sensor on the i2c bus
 * current sensor on pin 14

And the breadboard looks like this:

Breadboard with sensors on it

Next up is to make a little mount for the proximity sensor that holds it near
the carafe.

This entry was posted in Uncategorized and tagged breadboard, coffee-boss,
current sensor, diagram, ds3231, hx711, proximity sensor, vncl4010 on November
13, 2019 by sandy.


COFFEE BOSS DAY 8 – HOW TO SENSE THE CURRENT

Doing a bit more digging about the current clamp tells me something I realised
was obvious in retrospect – the clamp needs to be around a live wire on its
own. Clamping it over the multi-core mains cable won’t work because the EM fields
from the live wire and the return path through the neutral wire cancel each
other out.
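
The cancellation can be shown with a few lines of Python. This is an idealised sketch: the sample count and the peak current are arbitrary illustrative values, and it assumes the neutral carries exactly the live current in the opposite direction.

```python
import math


def clamp_sees(currents):
    """RMS of the net current through the clamp over one sampled mains cycle."""
    net = [sum(sample) for sample in zip(*currents)]
    return math.sqrt(sum(i * i for i in net) / len(net))


N = 200       # samples per mains cycle (arbitrary)
PEAK = 12.7   # amps, roughly a 9A RMS load (illustrative)

live = [PEAK * math.sin(2 * math.pi * n / N) for n in range(N)]
neutral = [-i for i in live]  # the return path: same current, reversed

print(round(clamp_sees([live]), 2))           # clamp around the live wire only
print(round(clamp_sees([live, neutral]), 2))  # clamp around the whole cable
```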

That’s why the clamp last night (and today, with the proper value components
around it) shows no change whether the heater is on or not. OK. There are some
high-end sensors that can sense current in multi-conductor cables but the only
accessible one I can see is this Modern Device one
(https://moderndevice.com/product/current-sensor/). It looks like it’ll work
great and isn’t expensive ($14), but it’s not a nice tidy clamp.

Options:

 1. Build a device that can be placed inline with the power cable that will
    allow the sensor access to only one of the wires in the cable. This could
    just be a customised power cable, quite easy to make but liable to raise an
    eyebrow from the estates people at work. This means having to understand the
    difference between the heater running and the hotplates.
 2. Mount the current clamp inside the coffee machine, over the wire that goes
    to the heater. This is non-invasive-ish, but I’m a little nervous about
    proximity to the heater and also having trailing wires hanging out of the
    coffee machine. This would give a nice unambiguous and accurate signal of
    when the heater is running though.
 3. Tap or sense the wire that the float switch is on. The float switch is in
    the water reservoir and is what triggers the heater to run. This may well be
    low voltage, and DC so I think a current sense clamp won’t make anything of
    it. The cable is terminated with plain plugs and sockets that look fairly
    standard so I could just make a special cable that has a sensor on it to
    pull a pin high or low when it’s closed or opened.
    I still don’t like the idea of having cables trailing out of the coffee
    machine. Perhaps I can make something discreet in the base – there is plenty
    of space, and routing for a cable out the bottom (the cable for the hotplate
    uses it). Maybe even some pogo pins on the bottom that would integrate with
    a little board mounted on the scales that the whole machine sits on.

I think option 3 is my favourite. It’s likely to avoid working with high voltage
or current and that cable assembly looks easy to do something with in a
reversible way. I need to see how to use a pin to sense the open or closedness
of that switch. I feel like that should be easy.


SOME MACHINE PARTS

Docs for parts: https://doczz.pl/doc/430837/bravilor-bonamat here’s an excerpt
that mentions the parts for the Novo:

 * PCB is 402650 or 6.101.153.000 – keypad PCB L 82mm W 65mm countries GB/IRL
   buttons: 2
 * Float switch is 347388 or 6.101.071.000 – magnetic switch NO connection
   plug-in connection cable length 160mm L 45mm mounting ø 6mm – what’s that
   connector on the end?
 * Heater is 417420 or 6.101.061.000 – flow heater 2000W 230V ø 58mm H 120mm
   connection male faston 6.3mm

 * Bonamat_ENU.pdf
 * Bravilor-Bonamat-Novo-Old-Models

I can’t find any description of the connector on the float switch. It looks like
a KK-396 housing https://uk.rs-online.com/web/p/pcb-connector-housings/6795066/
except measuring the pins looks like the pitch is 3.6 or 3.7mm rather than
3.96mm.

This entry was posted in Uncategorized and tagged clamp, coffee-boss, current,
flow-heater, novo, sensor on November 13, 2019 by sandy.


COFFEE BOSS DAY 7 – MOAR SENSORZ

Those last graphs were nice but I don’t think they tell me enough. The graphs
and the analyses make intuitive sense to the eye but they’re still disconnected
enough from concrete reality to make it hard to put the numbers into context.
I’ve run up against the limit for what my simple-minded analytics can do for me.
I could learn more analytic skills, you know like nunchuck skills, bowhunting
skills, computer hacking skills. But I won’t, I’m just going to brute force it
with more sensors.

I’ve got an AC current sensor (SCT-013-000 Non-invasive AC Current Sensor Clamp
Sensor 100A High QUALITY!!!) and an Adafruit VCNL4010 proximity sensor.


INSIDE THE COFFEE MACHINE

The coffee machine has a float switch in the water reservoir that is triggered
as long as there’s water left in. I don’t really know how the heater/pump works
or even if there is a pump there or if it’s some kind of expansion thing powered
by the heater on its own. I looked inside:

And I’m not sure what to make of that, doesn’t look like a mechanical pump but
water goes in the bottom that is discoloured from heat, and comes out the top
tube going up to the sprinkler-head.

There’s two wires running from the controller mechanism on the left to the
heater capsule, and they look the same as the ones leading from the main 240v
input connector (bottom left) as well as sharing the same connector, so I’m
going to surmise that this is a 240V AC heater.

Searching more, I find this: https://www.gastroparts.com/en/part-113792 which is
a “flow heater”, 2160W, 240V. Power divided by voltage gives current so 2160W /
240V = 9A.


THE CURRENT SENSING CLAMP

 * https://learn.openenergymonitor.org/electricity-monitoring/ct-sensors/interface-with-arduino
 * https://olimex.wordpress.com/2015/09/29/energy-monitoring-with-arduino-and-current-clamp-sensor/

Ideally I’ll put the current clamp on those two wires so it only senses when the
heater is running. I’d prefer not to have something mounted inside the machine
since I’ll get the blame if the office burns down.

So I’ll test this system with the clamp fitted on the main power cable instead,
and see how hard it is to recognise the “heater running” signal from the
“hotplate” on signal. If it’s obvious, then that’s ideal. If it is
indistinguishable then I’ll try it inside.

Looking at the link above, I should do some sums to figure out how to wire up
the current sensor. I don’t really understand this so will be Just Doing As I’m
Told.

Primary peak-current = RMS current × √2 = 13A × 1.414 = 18.3A. 

I picked 13A as I wanted to leave a little headroom over the current that the
heater would draw in case the hotplate is on too. I can’t find any spares for
that bit yet so don’t know what that’ll draw.

Secondary peak-current = Primary peak-current / no. of turns = 18.3A / 2000 = 0.009191A

Ideal burden resistance = (AREF/2) / Secondary peak-current = 1.65 V / 0.009191A = 179.52 Ω
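
The same sums as a short Python sketch, so they can be redone for a different current or reference voltage. The function name is made up, and the 3.3V AREF is inferred from the 1.65 V figure above.

```python
import math


def burden_resistor(rms_current, turns, aref):
    """Ideal burden resistance in ohms for a current-transformer clamp."""
    primary_peak = rms_current * math.sqrt(2)  # peak of the mains current
    secondary_peak = primary_peak / turns      # scaled down by the CT turns
    return (aref / 2) / secondary_peak         # peak should swing to AREF/2


# Heater current from the part description: power divided by voltage.
print(2160 / 240)

ohms = burden_resistor(rms_current=13, turns=2000, aref=3.3)
print(f"{ohms:.1f}")
```

Using the full-precision square root instead of 1.414 moves the answer only in the second decimal place, so it still points at a 180ish ohm resistor.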

I need a 180ish ohm burden resistor and a 10uF capacitor. I’ll get those from
the workshop tomorrow.

I wired this up and found that there was a constant stream of numbers coming
out, between 200 and 400, but turning the load on or off didn’t make any
difference. In fact it’s the same whether it’s clamped over a cable or not, so I
think this is just electrical noise. Boo.

Update! With the cap and burden resistor… Exact same response as last night ie
fairly stable number coming out of this sensor, but no relation to whether the
device is on or not. So… I was testing it with the clamp on the mains lead of a
heater. It needs to be on just one wire because both wires cancel each other
out. I’m going to have to build an intercept box for the power cable to mount
externally, or mount the clamp inside the coffee machine, on one of the cables
near the heater. OK – that’s some more useful information. I wonder if there’s a
way to get the signal out of the machine without trailing wires out of it? I
don’t want it to fail its PAT test and inspection.


VCNL4010 PROXIMITY SENSOR

This is a short-range IR-based proximity sensor chip mounted on a neat little
breakout board. It uses i2c but there’s also an adafruit library for it so I
don’t need to cry about the bus.

I’m planning to mount this on a little plate directly on the coffee machine
behind the collar of the carafe so that it can be used to sense the presence or
absence of the coffee jug.


ADAFRUIT LIBRARIES

Adafruit have info about this module here
https://learn.adafruit.com/using-vcnl4010-proximity-sensor/overview and docs for
the arduino library here:
https://adafruit.github.io/Adafruit_VCNL4010/class_adafruit___v_c_n_l4010.html.
I tried the standard vcnl4010test.ino example that came with the
Adafruit_VCNL4010 library and found that the measurement maximum distance was
about 45mm which is _just_ too short for my application. It is sensitive up
close (5-20mm) but tails off the further away you get.

I noticed from the docs though that I could turn up the power on the LED in the
proximity sensor (it works by measuring the intensity of reflected IR, rather
than time of flight) so I turned that up to maximum (indicating 200mA). Made no
difference.

I could still mount this sensor on a little tower so it is closer to the carafe
ring. It would be fairly well protected from bumps but I did want to keep this
quite compact and unobtrusive. Is one of those ultrasonic detectors the answer?
I think I have one in that kit that Euan gave me that time. I’ll check tomorrow.



This entry was posted in Uncategorized and tagged arduino, coffee-boss, current,
vcnl4010 on November 8, 2019 by sandy.


COFFEE BOSS DAY 6.5: COMBINING A SCATTER WITH A LINE

So based on what I spotted in the source code of matplotlib’s _axes.py
(https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes/_axes.py#L1469-L1493):

    def plot(self, *args, scalex=True, scaley=True, data=None, **kwargs):
        """
        Plot y versus x as lines and/or markers.

        Call signatures::

            plot([x], y, [fmt], *, data=None, **kwargs)
            plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)

        The coordinates of the points or line nodes are given by *x*, *y*.

        The optional parameter *fmt* is a convenient way for defining basic
        formatting like color, marker and linestyle. It's a shortcut string
        notation described in the *Notes* section below.

        >>> plot(x, y)        # plot x and y using default line style and color
        >>> plot(x, y, 'bo')  # plot x and y using blue circle markers
        >>> plot(y)           # plot y using x as index array 0..N-1
        >>> plot(y, 'r+')     # ditto, but with red plusses

I saw that I could use my axe to simply plot the positions of the indicators in
two dimensions. I got this:

Which is pretty much perfectly what I want right now. I did some fairly dirty
mucking around with the data to get it to do this, essentially looking for where
the row-to-row weight difference crosses a threshold from low-to-high.

# median filter with a rolling window: low pass filter
df['rolling4'] = df['weight'].rolling(4).median()

# normalise by looking for difference over 8 samples
df['diff'] = df['rolling4'].diff(periods=-8)

# Tag with True where the change is over 300g
threshold = 300.0
df['thresholded'] = (df['diff'] > threshold)

# Produce 'highlight' boolean where the threshold is True, AND
# the threshold for the previous row was False. This feels pretty clunky.
df['highlight'] = (df['thresholded'] == True) & (df['thresholded'].shift(1) == False)

# Now create a new dataframe with just the highlights in, and only the interesting columns
highlights = df[df['highlight']][['datetime', 'rolling4']]

That’s good isn’t it?
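
For comparison, the "True now, False on the previous row" test spelled out in plain Python. This is only a sketch of the same idea, not a replacement for the pandas version, and the example flags are invented.

```python
def rising_edges(flags):
    """Indices where a True directly follows a False."""
    return [
        i
        for i, (prev, cur) in enumerate(zip(flags, flags[1:]), start=1)
        if cur and not prev
    ]


thresholded = [False, False, True, True, False, True]  # invented example
print(rising_edges(thresholded))
```

Like the pandas version, this never marks the very first row, because there is no previous row to compare it with.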

This entry was posted in Uncategorized and tagged analysis, coffee-boss,
matplotlib, pandas, python on October 29, 2019 by sandy.


COFFEE BOSS DAY 6: HORIZONTAL LINE

I’ve been trying to get a horizontal line to show a threshold. It never worked.
It gave me a mean-spirited error message that I couldn’t understand. I spent the
last few days trying. I got this one:

Traceback (most recent call last):
  File "C:/Users/sandy_000/PycharmProjects/coffee_boss/viz/viz.py", line 65, in <module>
    df.plot(y=['diff'], ax=ax3)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_core.py", line 794, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\__init__.py", line 62, in plot
    plot_obj.generate()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 284, in generate
    self._adorn_subplots()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\core.py", line 472, in _adorn_subplots
    sharey=self.sharey,
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 316, in _handle_shared_axes
    _remove_labels_from_axis(ax.xaxis)
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\pandas\plotting\_matplotlib\tools.py", line 281, in _remove_labels_from_axis
    for t in axis.get_majorticklabels():
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1252, in get_majorticklabels
    ticks = self.get_major_ticks()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1407, in get_major_ticks
    numticks = len(self.get_majorticklocs())
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\axis.py", line 1324, in get_majorticklocs
    return self.major.locator()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1431, in __call__
    self.refresh()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1451, in refresh
    dmin, dmax = self.viewlim_to_dt()
  File "C:\Users\sandy_000\venv\coffee_boss\lib\site-packages\matplotlib\dates.py", line 1202, in viewlim_to_dt
    .format(vmin))
ValueError: view limit minimum 0.0 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units

So what I did was change

ax3.axhline(threshold, linewidth=1, color='r')

df.plot(y=[small_window['name'], 'thresholded'], secondary_y=['thresholded'], ax=ax1)
df.plot(y=[small_window['name']], ax=ax2)
df.plot(y=['diff'], ax=ax3)

To

df.plot(y=[small_window['name'], 'thresholded'], secondary_y=['thresholded'], ax=ax1)
df.plot(y=[small_window['name']], ax=ax2)
df.plot(y=['diff'], ax=ax3)

ax3.axhline(threshold, linewidth=1, color='r')

Yes. Same, but the hline happens after the plot. OK, I can make the intuitive
leap for why this works and not be cross about it, but I wish I’d tried this a
week ago.

This entry was posted in Uncategorized and tagged analysis, coffee-boss,
matplotlib, pandas, python on October 28, 2019 by sandy.


COFFEE BOSS DAY 5: FURTHER ADVENTURES

I’ve been doing a bit more work which is about:


ANALYSIS

Finding an algorithm or treatment (I don’t know what the right word is… an
analysis?) that will isolate significant changes to the weight of the machine, by:

 1. applying a high-cut filter using a median filter in a rolling window,
 2. then producing a diff to create something normalised,
 3. then thresholding that to produce some binary output

column_names = ['datestamp', 'date', 'time', 'weight']
df = read_csv('../output/datr20190911.csv', names=column_names, parse_dates=True, infer_datetime_format=True)
df = df.append(read_csv('../output/datr20190912.csv', names=column_names, parse_dates=True, infer_datetime_format=True))
df = df.append(read_csv('../output/datr20190913.csv', names=column_names, parse_dates=True, infer_datetime_format=True))
df = df.append(read_csv('../output/datr20190914.csv', names=column_names, parse_dates=True, infer_datetime_format=True))

df['datetime'] = pd.to_datetime(df['datestamp'])
df.set_index('datetime')
del df['datestamp']
del df['time']
del df['date']

small_window = {'name': 'rolling4', 'size': 4}
df[small_window['name']] = df['weight'].rolling(small_window['size']).median()

df['diff'] = df[small_window['name']].diff(periods=8)

threshold = 600.0
df['thresholded'] = (df['diff'] > threshold) * 1
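
To make those three steps concrete, here is roughly what the rolling median, the diff and the threshold compute, written out in plain Python. It is a sketch with invented weights (a steady 1500g with one noisy sample, then a jump to 3500g), and it uses None where pandas would use NaN.

```python
from statistics import median


def rolling_median(values, size):
    """Median of each trailing window; None until the window is full."""
    return [
        median(values[i - size + 1:i + 1]) if i >= size - 1 else None
        for i in range(len(values))
    ]


def diff(values, periods):
    """values[i] - values[i - periods]; None where either side is missing."""
    out = []
    for i, v in enumerate(values):
        j = i - periods
        if j < 0 or v is None or values[j] is None:
            out.append(None)
        else:
            out.append(v - values[j])
    return out


def thresholded(diffs, threshold):
    """1 where the change exceeds the threshold, otherwise 0."""
    return [1 if d is not None and d > threshold else 0 for d in diffs]


weights = [1500.0] * 12
weights[5] = 4000.0          # a single noisy sample
weights += [3500.0] * 8      # then the machine gets about 2kg heavier

smoothed = rolling_median(weights, size=4)
flags = thresholded(diff(smoothed, periods=8), threshold=600.0)
print(flags)
```

The lone 4000g sample never makes it through the median filter, so only the sustained jump trips the threshold.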


DISPLAYING IT TO CHECK THE ANALYSIS

Find how to display the charts in a way that lets me see what I’m actually
doing. This has evolved into what’s below.

fig1, (ax1, ax2, ax3) = plt.subplots(nrows=3, ncols=1, sharex='all')

ax1.grid(b=True, which='major', color='#666666', linestyle='-')
ax1.minorticks_on()
ax1.grid(b=True, which='minor', color='#999999', linestyle='-', alpha=0.2)

ax2.grid(b=True, which='major', color='#666666', linestyle='-')
ax2.minorticks_on()
ax2.grid(b=True, which='minor', color='#999999', linestyle='-', alpha=0.2)

ax3.grid(b=True, which='major', color='#666666', linestyle='-')
ax3.minorticks_on()
ax3.grid(b=True, which='minor', color='#999999', linestyle='-', alpha=0.2)


df.plot(x='datetime', y=[small_window['name'], 'thresholded'], secondary_y=['thresholded'], ax=ax1)
df.plot(x='datetime', y=[small_window['name']], ax=ax2, figsize=(8, 8))
df.plot(x='datetime', y=['diff'], ax=ax3)

plt.tight_layout()
plt.show()



This entry was posted in Uncategorized and tagged analysis, coffee-boss, esp32,
matplotlib, pandas on October 25, 2019 by sandy.


COFFEE BOSS DAY 4: WHAT AM I ACTUALLY TRYING TO DO

Matplotlib and pandas have a couple of fundamental principles that I’m not
getting. There seems to be an odd mix of global and specific commands that go
into expelling a graph and I’m not seeing the link.

 * http://jonathansoma.com/lede/algorithms-2017/classes/fuzziness-matplotlib/how-pandas-uses-matplotlib-plus-figures-axes-and-subplots/
 * http://jonathansoma.com/lede/algorithms-2017/classes/fuzziness-matplotlib/understand-df-plot-in-pandas/

Naturally this is causing me to bump into some awkward questions, the main one
being “what am I actually trying to do?”. I thought this was simple, but it’s
not quite. I sketched the following manually as capturing what I’d like:

This chart shows the features I think I need to gather:

 1. Weight of each cup of coffee. The height of the grey boxes show this. This
    can be recognised by seeing a rapid drop in weight where the size of the
    drop is greater than can be explained by evaporation. I want to know this so
    that I can see the variance between the biggest and the smallest cups.
    Everyone pays the same.
 2. Freshness of the pot of coffee (time since last pot). The first vertical
    line shows the start of a new pot. I can intuitively recognise this point as
    being where there is a sudden increase in weight of about 2kg. This is
    obvious in some cases (like the end of the figure below) where the weight is
    low and rapidly increases.
    It is less obvious in the refill from the beginning of the figure below,
    where the weight beforehand was high too so there isn’t that clear jump from
    very low to very high. I assume in this case, there was already a spare pot
    of water on top of the machine waiting to be used, and so the weight drop
    (that is visible) only lasts the time between picking the pot up and pouring
    it into the machine.
 3. Number of cups in each pot. This is a simple count of the number of events
    recognised in 1. I can’t see any way to determine if the last small drop
    before a refill is a cupful (ie someone’s taking it) or if it’s just waste.
    A combination of age of coffee and size of cup may form a heuristic for that
    but I don’t know how to gather the data from the scales alone. I might add a
    button on the touchscreen for “discarded waste/reset pot”.
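
One possible way to turn features 1 to 3 into code, as a sketch only. It assumes the weight log has already been reduced to a list of net changes in grams, and the two thresholds (100g for a cup, 1500g for a refill) are guesses rather than measured values. It also ignores the pot being lifted and put back.

```python
def summarise_pots(changes, cup_min=100.0, pot_min=1500.0):
    """Group cup-sized weight drops under the refill that preceded them.

    changes: weight changes in grams, negative meaning the machine got lighter.
    Returns one list of cup weights per pot.
    """
    pots = []
    current = None
    for change in changes:
        if change >= pot_min:
            # A big jump up is taken to be a new pot being started.
            current = []
            pots.append(current)
        elif change <= -cup_min and current is not None:
            # A drop too big to be evaporation is taken to be a cup.
            current.append(-change)
    return pots


changes = [2000, -250, -180, -5, -300, 2050, -220]  # invented example
pots = summarise_pots(changes)
print(pots)
print([len(cups) for cups in pots])                   # cups per pot
print(max(pots[0]) - min(pots[0]))                    # biggest vs smallest cup
```

The last small drop before a refill is still counted as a cup here, which is exactly the ambiguity described in point 3.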

The width (or length) of the grey boxes is interesting (indicating the time
between cups), but I’ve got no direct need for that data yet.


MAKE IT TIME-SERIES

Right now, the data is arranged in time sequence, and has a fixed sampling
frequency, so it is a complete time-series. However, pandas doesn’t know that
yet, the labels for time are just strings. I’ll make it into a true time-series
because Pandas has a bunch of specific tools for working with time-series data
(including resampling and how I specify the size of windows in seconds rather
than samples) AND I want to be able to combine multiple days into one stream of
data.

 * https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
 * https://chrisalbon.com/python/data_wrangling/pandas_time_series_basics/
 * https://jakevdp.github.io/PythonDataScienceHandbook/03.11-working-with-time-series.html

Remember that the first tutes assumed I was converting to datetimes during
import using parse_dates=True in the read_csv(...). That never worked for me,
and I got errors I didn’t understand. I used df.info() to check whether the
conversion had worked properly. The code now looks like:

column_names = ['datestamp', 'date', 'time', 'weight']
df = read_csv('../output/datr20190923.csv', names=column_names, parse_dates=True, infer_datetime_format=True)
df['datetime'] = pd.to_datetime(df['datestamp'])
df.index = df['datetime']
del df['datestamp']
del df['time']
del df['date']
print(df.info())

and gives me:

[41141 rows x 5 columns]
 
 RangeIndex: 41141 entries, 0 to 41140
 Data columns (total 5 columns):
 weight        41141 non-null float64
 datetime      41141 non-null datetime64[ns]
 rolling4      41138 non-null float64
 rolling36     41106 non-null float64
 pct_change    41105 non-null float64
 dtypes: datetime64[ns](1), float64(4)
 memory usage: 1.6 MB

When I run it. There’s a datetime64 object in there which is good! I wonder why
it didn’t work last week? Furthermore,

data = pd.Series(df['pct_change'])
print(data)

Now gives me a time-indexed series:

datetime
 2019-09-23 00:00:01         NaN
 2019-09-23 00:00:03         NaN
 2019-09-23 00:00:05         NaN
 2019-09-23 00:00:07         NaN
 2019-09-23 00:00:09         NaN
                          …   
 2019-09-23 23:59:51    0.000046
 2019-09-23 23:59:53    0.000000
 2019-09-23 23:59:55    0.000043
 2019-09-23 23:59:57   -0.000043
 2019-09-23 23:59:59    0.000000
 Name: pct_change, Length: 41141, dtype: float64



This entry was posted in Uncategorized and tagged analysis, coffee-boss, esp32,
matplotlib, pandas on October 5, 2019 by sandy.


COFFEE BOSS DAY 3: LOOKING FOR EVENTS

I don’t know how to do this bit. Not I don’t know technically, I mean I have no
awareness of the nature of the tools and practice to look for events in a data
stream, categorise them, and present them.

My opening gambit is:

 1. Look through each weight sample, comparing it to the last (or the last few).
 2. If the current value is higher or lower (over a certain threshold) than it
    was, then:
 3. Record this as a significant event by putting it into another list with the
    same timestamp (events)
 4. Combine the events stream with the main data frame
 5. Present the raw weights data in a graph, and:
 6. Show the events overlaid

I can iterate through each row just using iterators and python loops, but that
feels like a pandas anti-pattern. From reading around (how do I even describe
this problem for google?), it seems like it’s best to do things in pandas en
masse rather than by examining each record individually. I think that’s what
pandas does.

df['pct_change'] = df[large_window['name']].pct_change()
df.plot(x='time', y=[large_window['name'], 'pct_change'], secondary_y=['pct_change'])

That’s a bit like what I’m looking for. The pct_change (percent change:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.pct_change.html?highlight=pct_change#pandas.Series.pct_change)
will detect the scale of changes. It’ll hover around 0, but wherever the weight
makes a big jump, the percent change is also big.

This also uses the secondary_y kwarg which seems barely documented and most guides
suggest a different approach. Here is something:
https://stackoverflow.com/questions/29685887/secondary-y-true-changes-x-axis-in-pandas.

A negative percent change means the coffee machine is lighter (ie the pot is
lifted or a cup is taken). A positive percent change means the machine got
heavier (ie pot replaced or water refilled).

I can look through those percent changes and spot ones bigger than [a certain
value], and mark those cases on the plot or save them out somehow for further
analysis.
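
A plain-Python sketch of that last step: the same calculation as pct_change (each value compared with the one before it), followed by picking out the changes bigger than a chosen limit. The weights and the 5% limit are invented, and it assumes no weight is ever exactly zero.

```python
def pct_change(values):
    """Fractional change from the previous value; None for the first one."""
    out = [None]
    for prev, cur in zip(values, values[1:]):
        out.append((cur - prev) / prev)
    return out


def big_changes(values, limit):
    """(index, change) pairs where the change is bigger than the limit."""
    return [
        (i, p)
        for i, p in enumerate(pct_change(values))
        if p is not None and abs(p) > limit
    ]


weights = [2250.0, 2249.0, 2000.0, 2001.0, 4000.0]  # invented example
for index, change in big_changes(weights, limit=0.05):
    direction = "lighter" if change < 0 else "heavier"
    print(index, direction, round(change, 3))
```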

This entry was posted in Uncategorized and tagged analysis, coffee-boss, esp32,
matplotlib, pandas on October 5, 2019 by sandy.


COFFEE BOSS DAY 2: DOING SOME WORK.

After PUBG last night, I spent a few hours trying to understand pandas and
matplotlib. At 1:30am I called it a day. 10 hours into this task.

Looking at:
https://machinelearningmastery.com/time-series-data-visualization-with-python/
And Julia Evans’ https://github.com/jvns/pandas-cookbook.


DATA DESCRIPTION

The machine logs into two files:

 * datr<date>.csv which is a regular log of raw measurements from the scale. The
   frequency of this logging is set using the regularLogInterval variable in
   coffee_boss.ino. These values are unfiltered, and contain all of the weird
   noise that this circuit collects. They are, however, a true time-series.
   The R in datr stands for Regular. But Raw works too. These files end up big
   (~3MB).
 * datc<date>.csv which is a log of changes greater than a particular threshold
   which is designed to be just about the smallest thing that can happen with
   the machine. That threshold is specified in the changeThreshold variable in
   coffee_boss.ino. It’s currently 30, so this file will only log changes
   greater than 30 grams. This stream of measurements is also filtered, being a
   running median of the last 8 raw measurements. This filters out almost all of
   the noise. It uses the RunningMedian library for this. This is not a true
   time series, since the logging frequency is not constant.
   The C in datc stands for Change. I wish I’d thought of better names. These
   files are only a couple of KB in size.
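To make the difference between the two files concrete, here is a loose Python imitation of the change log as described above. It is not the firmware code from coffee_boss.ino. It assumes that each new filtered value is compared against the last value that was logged, and that the median of an even-sized window is the average of the two middle values. The name log_changes and the sample numbers are invented.

```python
from collections import deque
from statistics import median

def log_changes(raw, window=8, change_threshold=30.0):
    """Return (position, filtered_value) pairs for each logged change."""
    recent = deque(maxlen=window)   # holds only the last `window` raw readings
    logged = []
    last_logged = None
    for position, value in enumerate(raw):
        recent.append(value)
        filtered = median(recent)
        # Assumption: a change is measured against the last logged value.
        if last_logged is None or abs(filtered - last_logged) > change_threshold:
            logged.append((position, filtered))
            last_logged = filtered
    return logged

# Made-up readings: steady at 1000 g, one wild spike, then a 200 g drop.
raw = [1000.0] * 10 + [5000.0] + [1000.0] * 5 + [800.0] * 10
print(log_changes(raw))
```

The single 5000 g spike never reaches the log because the median ignores it, while the 200 g drop shows up a few samples late, once enough of the window has moved to the new level.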

Both log files have the same format. CSVs with four columns:

 * datetime (%Y-%m-%dT%H:%M:%S)
 * date (%Y-%m-%d)
 * time (%H:%M:%S)
 * weight (float with 2 decimal places)

You can find some examples of these files in
https://github.com/euphy/coffee_boss/tree/master/output. They look like this:

2019-09-26T02:57:53,2019-09-26,02:57:53,2249.48
2019-09-26T03:07:55,2019-09-26,03:07:55,2221.33
2019-09-26T03:08:51,2019-09-26,03:08:51,2119.35
2019-09-26T03:08:51,2019-09-26,03:08:51,1886.68
2019-09-26T03:08:51,2019-09-26,03:08:51,1895.43
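A minimal sketch of loading a file in this format with the first column parsed into real datetimes, which is what a proper time axis needs. The function name read_log and the column name 'datetime' are made up here (the scripts further down call that column 'datestamp'), and the rows are copied from the sample above.

```python
import io
import pandas as pd

COLUMN_NAMES = ['datetime', 'date', 'time', 'weight']

def read_log(source):
    """Read a headerless four-column log and parse the first column as datetimes."""
    df = pd.read_csv(source, names=COLUMN_NAMES)
    df['datetime'] = pd.to_datetime(df['datetime'], format='%Y-%m-%dT%H:%M:%S')
    return df

# An in-memory stand-in for a log file.
sample = io.StringIO(
    "2019-09-26T02:57:53,2019-09-26,02:57:53,2249.48\n"
    "2019-09-26T03:07:55,2019-09-26,03:07:55,2221.33\n"
    "2019-09-26T03:08:51,2019-09-26,03:08:51,2119.35\n"
)
log = read_log(sample)
print(log.dtypes)
print(log['datetime'].diff())   # time elapsed between consecutive rows
```

With a file on disk, read_log('../output/datc20190926.csv') would work the same way.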

I want to see this as a line graph with time along the X axis stretching from
left to right, and weight on the Y-axis, bottom to top. Doing this with the datR
files would be easiest, because they are naturally already time-series data, but
they give a pretty awful output because they are so noisy. I’d have to do some
filtering on them in Python. That’s not such a bad idea.

The datC files are already filtered, but they are not in time series, so I have
to either:

 * use the datR files and figure out how to filter the noise out of them. This
   seems like pandas work. I don’t know how to use pandas.
   Or
 * use the datC files and figure out how to present them as time series –
   interpolation of missing points perhaps, or some other built-in way to do
   this in matplotlib. I don’t know how to use matplotlib.
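For the second option, one possible approach (a sketch, not what the post ends up doing) is to treat each logged value as holding until the next change, and resample onto a regular grid with forward fill. It assumes the datetime column has already been parsed; the name to_regular_series, the one-minute grid and the timestamps are illustrative.

```python
import pandas as pd

def to_regular_series(changes, freq='1min'):
    """Turn irregular change events into a regularly spaced series.

    `changes` needs a parsed 'datetime' column and a 'weight' column.
    """
    indexed = changes.set_index('datetime')['weight']
    # Keep the last reading in each bin, then carry it forward through empty bins.
    return indexed.resample(freq).last().ffill()

# Made-up change events a few minutes apart.
changes = pd.DataFrame({
    'datetime': pd.to_datetime([
        '2019-09-26 03:00:10', '2019-09-26 03:02:40', '2019-09-26 03:05:05',
    ]),
    'weight': [2249.48, 2119.35, 1895.43],
})
regular = to_regular_series(changes)
print(regular)
```

Plotting the original points with drawstyle='steps-post' against a datetime x-axis would be another way to get the same staircase shape without resampling.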


SIMPLE PLOT WITH PANDAS AND MATPLOTLIB

I’ve started with the datC approach:

from pandas import read_csv
from matplotlib import pyplot

column_names = ['datestamp', 'date','time','weight']
series = read_csv('../output/datc20190926.csv', names=column_names)
series.plot()
pyplot.show()

Which renders a nice graph:

This shows what I expect, in a way. Two pots of coffee made, with about six cups
being taken from each one.

There’s something wrong though. The first third is all a bit messy. Not sure
what’s happening there, so we’ll look at the time, and hm, there isn’t even a
time noted. The x-axis is a count of samples, not a time series. I can’t tell if
that first disorganised section is an hour or nine hours. I can’t tell if the
first pot of coffee was drunk in an hour or in twelve. Given that this data
covers a full day (from midnight to midnight), it looks like one pot of coffee
lasted all day, and there was another one made late at night (7pm maybe?).


SIMPLE PLOT OF RAW, TIME-SERIES DATA

Let’s try with the raw data (output/datr20190926.csv):

Ok that’s no better. There are a few bad samples in there that have obscured
the good samples. Can I filter it? It still hasn’t got the right X-axis either,
measuring samples rather than time.


FILTER AND REMOVE OUT-OF-RANGE SAMPLES

Filtering out values that I _know_ are bad will help and is easy. See
https://stackoverflow.com/questions/29594841/how-to-filter-out-values-by-values-in-pandas-dataframe.

from pandas import read_csv
from matplotlib import pyplot

column_names = ['datestamp', 'date', 'time', 'weight']
series = read_csv('../output/datr20190926.csv', names=column_names)
series = series[(series['weight'] > -2000) & (series['weight'] < 10000 )]
series.plot()
pyplot.show()

Which filters out weights less than -2000g and more than 10000g. This is better
because I can see the overall shape of the values and the positions in the
overall body of samples (across the whole day). I can see that the cups being
taken from each pot are not regularly spaced.

But there’s still a lot of noise that doesn’t hit those thresholds, and
importantly, this approach simply throws away the samples that are outside the
bounds. So that means there is a time gap at those points, and if enough of them
happen (I can only see two here), then the time-series is discontinuous.

So I think the approach is not to filter out and discard bad values, it is to
use the source log data to produce an entirely new stream of weight values
using something like a moving window of averaging. That’s how the firmware does
it and that gives a decent result.


USE A ROLLING() SAMPLE WINDOW TO REMOVE NOISE

There is a rolling_mean() function in pandas:
https://www.programcreek.com/python/example/101378/pandas.rolling_mean. Oh it’s
deprecated. And I don’t want a mean anyway, means are rubbish, I want the
median.
https://stackoverflow.com/questions/43437657/rolling-mean-on-pandas-on-a-specific-column
describes using pandas.rolling().mean() instead, which is closer. I assume
there’s a .median() function too.

from pandas import read_csv
from matplotlib import pyplot

column_names = ['datestamp', 'date', 'time', 'weight']
series = read_csv('../output/datr20190926.csv', names=column_names)
series['rolling'] = series['weight'].rolling(8).mean()
series.plot()
pyplot.show()


That’s a bit more like it. It proves that mean() isn’t the right one (mean
averages are very affected by outliers). So now I need to figure out how to only
plot rolling rather than rolling and the raw weight. I think this is a
matplotlib issue. Well it is, and it isn’t:
http://jonathansoma.com/lede/algorithms-2017/classes/fuzziness-matplotlib/understand-df-plot-in-pandas/
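To see why the mean struggles, here is a tiny made-up example: eight readings around 2000 g with one wild 9000 g sample, run through a five-sample window both ways. The numbers and the window size are invented for illustration.

```python
import pandas as pd

# Steady readings with a single bad sample in the middle.
weights = pd.Series([2000.0, 2001.0, 1999.0, 2000.0, 9000.0, 2001.0, 2000.0, 1999.0])
window = 5

rolling_mean = weights.rolling(window).mean()
rolling_median = weights.rolling(window).median()

comparison = pd.DataFrame({
    'weight': weights,
    'mean': rolling_mean,
    'median': rolling_median,
})
print(comparison)
```

Every window that contains the bad sample has its mean dragged up to around 3400 g, while the median stays within a gram of the real weight.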


USING ROLLING().MEDIAN()

from pandas import read_csv
from matplotlib import pyplot

column_names = ['datestamp', 'date', 'time', 'weight']
series = read_csv('../output/datr20190926.csv', names=column_names)
series['rolling'] = series['weight'].rolling(8).median()
series.plot(x='time', y='rolling')

# Rotate x ticks and tight_layout fits it all on the page
pyplot.xticks(rotation='vertical')
pyplot.tight_layout()
pyplot.show()

That’s more like it. It uses .median() instead of .mean() to deal better with
outliers. I also figured out series.plot(x='time', y='rolling') to specify
which column goes on which axis, rotated the time ticks so they don’t overlap,
and used tight_layout() so they don’t fall off the bottom of the page.

This calculates the median value in a rolling window of 8 samples, so that’s
about sixteen seconds. I’m interested in filtering out some more of the jaggies,
and would like to see the results of a few different versions plotted together.
Doing it like below was a guess and it worked. It makes me start to think I’m
getting a bit of a clue on how to use this toolset. Four hours in.


SEE THE EFFECTS OF DIFFERENT SIZED ROLLING WINDOWS

series['rolling4'] = series['weight'].rolling(4).median()
series['rolling8'] = series['weight'].rolling(8).median()
series['rolling12'] = series['weight'].rolling(12).median()
series['rolling24'] = series['weight'].rolling(24).median()
series.plot(x='time', y=['rolling4', 'rolling8', 'rolling12', 'rolling24'])

What was that weird blip at 3am? So I know that 24 samples covers 48 seconds of
activity and filters out almost all variance. It shows the gulp-gulp-gulp of the
coffee cups down (8:45 to 11:41) which is really cool and really clear. But I
realised that’s not quite what I thought I was looking for.

This tells a story and at first I thought it filtered out too much. It doesn’t
show the sequence of events: lifting the coffee pot out, then returning it a
bit lighter. Because the pot is out for a short period of time it just
disappears. I thought that this little signature would be important to be able
to recognise a cup of coffee being taken.

In fact, I think the heavily-filtered plot, with the simple chunky downward
steps tells the story more clearly. If I could spot a drop of around a cup, then
that’s simple.
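A sketch of what spotting those steps might look like on a heavily filtered series. The 150 g to 350 g range for "around a cup" is a guess, not a measured value, and find_cup_drops is a made-up name. Drops bigger than a cup (the pot being lifted) are ignored, and any sizeable rise (a refill or a new pot) resets the reference level.

```python
import pandas as pd

def find_cup_drops(filtered, min_drop=150.0, max_drop=350.0):
    """Return (index_label, drop_in_grams) for each cup-sized downward step."""
    drops = []
    values = filtered.dropna()
    level = values.iloc[0]              # current reference weight
    for label, value in values.items():
        change = level - value          # positive means the machine got lighter
        if min_drop <= change <= max_drop:
            drops.append((label, change))
            level = value
        elif -change > min_drop:
            level = value               # refill or new pot: start from the new level
        # anything else is noise, or the pot briefly lifted out: ignore it
    return drops

# Made-up filtered weights: two cups, a brief pot lift (900), a refill, another cup.
filtered = pd.Series([2000.0, 2000.0, 1800.0, 1800.0, 900.0,
                      1590.0, 1590.0, 2400.0, 2400.0, 2200.0])
print(find_cup_drops(filtered))
```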

I’ve just realised that the matplotlib viewer that pops up when I do
pyplot.show() is really good. It does a zoom-in on a section! That’s exactly
what I wanted Excel to do for me. Excel really is the wrong tool for this job.


ADDING MORE TICKS TO THE X-AXIS

It does not show me very good ticks on the x-axis though, there aren’t enough.
I’ll need to add some more.

from pandas import read_csv
from matplotlib import pyplot

column_names = ['datestamp', 'date', 'time', 'weight']
series = read_csv('../output/datr20190926.csv', names=column_names)
small_window = ('rolling4', 4)
large_window = ('rolling36', 36)

series[small_window[0]] = series['weight'].rolling(small_window[1]).median()
series[large_window[0]] = series['weight'].rolling(large_window[1]).median()

series.plot(x='time', y=[small_window[0], large_window[0]])

count = series['time'].count()
no_of_ticks = 24
size_of_segment = int(count / no_of_ticks)

# two lists: row positions spread regularly across the series, and the
# time string at each of those positions to use as the tick label
indices = list()
tick_labels = list()
for i in range(0, no_of_ticks):
    # integer division keeps position an int so it can be used with .iloc
    position = (size_of_segment * i) + 42 // 2
    indices.append(position)
    tick_labels.append(series['time'].iloc[position])

pyplot.grid(True, which='major')
pyplot.xticks(labels=tick_labels, ticks=indices, rotation='vertical')
pyplot.tight_layout()
pyplot.show()

That seems like a clunky way to do it. There must be a better one! That’s enough
for today. Six hours work. Bit of Overwatch before bed.
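One candidate for that better way, sketched under the assumption that the first column has been parsed into real datetimes: plot against the datetime column and let matplotlib's date locators place a tick every hour. The function name plot_with_hourly_ticks and the example data are made up; HourLocator and DateFormatter are from matplotlib.dates.

```python
import pandas as pd
from matplotlib import pyplot
from matplotlib import dates as mdates

def plot_with_hourly_ticks(df, column):
    """Plot `column` against a real datetime axis with one tick per hour."""
    fig, ax = pyplot.subplots()
    ax.plot(df['datetime'], df[column])
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
    ax.tick_params(axis='x', labelrotation=90)
    ax.grid(True, which='major')
    fig.tight_layout()
    return ax

# Made-up stand-in for a day's log: one sample a minute for 24 hours.
times = pd.date_range('2019-09-26 00:00:00', periods=24 * 60, freq='1min')
example = pd.DataFrame({'datetime': times, 'weight': 2000.0})
example['rolling36'] = example['weight'].rolling(36).median()

ax = plot_with_hourly_ticks(example, 'rolling36')
# pyplot.show() would open the viewer as in the scripts above.
```

Because the x-axis is now real time rather than a row count, the tick positions stay correct even if samples are missing.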

This entry was posted in Uncategorized and tagged analysis, coffee-boss, esp32,
matplotlib, pandas on September 29, 2019 by sandy.

