AP ON COMPUTERSCIENCEFUTURES
SECURING THE FUTURE OF COMPUTER SCIENCE. Optimised for desktop first.

TUESDAY, 14 MARCH 2023

CURRENT STATE OF WEBGL / GLSL

If you are paying close enough attention, you might have noticed that the background of this blog is a very simple and straightforward wave simulation written in GLSL and rendered via WebGL. It was written specifically with the OpenGL Shading Language, Second Edition [1,2] text in mind, so it is compatible with every WebGL implementation ever conceived and is almost completely decoupled from the browser. It has some intentional 'glitch' in it, which is a reference to the analog days of Sutherland's work. Specifically, the simulation uses a textbook implementation of GLSL, as follows:

* GLSL 1.2
* OpenGL ES 2.0
* WebGL 1.0

The only coupling to the browser is the opening of a GL context and, if one clicks on the animation in the right place, an "un-project" operation that unwinds the Z-division so that the fragment underlying the mouse cursor can be calculated and the scene rotated using a very primitive rotation scheme, gimbal lock included (no quaternions here!). Both are extraordinarily simple 3D graphics operations that should not affect rendering at all and are the absolute minimum level of coupling one might expect. In short, it is the perfect test of the most basic WebGL capability.

The un-project operation is written with the minimal amount of code required and uses a little linear algebra trick to do it very efficiently. Feel free to inspect the code to see how it's done (a rough sketch of the idea appears later in this post).

CURRENT STATE

Update 26-Jun-23: After much trial and error, and testing on many, many devices, I have now successfully isolated three separate WebGL bugs. With the three bugs properly isolated, I'm starting the write-up and hope to submit the following bug reports in the next week or two:

1. Incorrect rendering of WebGL 1.0 scene in Google Chrome
2. WebGL rendering heisenbug causes GL context to crash in Chrome after handling some N fragments
3. WebGL rendering heisenbug causes GL context to incorrectly render scene after handling some N fragments

TBC...

The current state as at 14-March-2023 is that Chrome and other browsers are not able to run this animation for more than 24 hours without crashing, and on the latest versions of Chrome released in early March, the animation has slowed to ridiculously low FPS. Previously the animation ran at well over 30 FPS on most devices, but would crash after 24 hours. The animation will quite happily run on an old iPad running Safari; Chrome, however, currently seems to be struggling. The number of vertices and the number of sin() operations it needs to calculate are well within the capabilities of all modern processors, including those found in i-devices such as phones and tablets on which one can play a typical game.

Example of correct rendering on all modern browsers, including Safari on iPad.
Example of incorrect rendering on Chrome 111.0.5563.111 (64-bit) as at 23-Mar-23.

Brave Browser (based on Chromium) renders correctly, but is horrendously slow in the GA branch (the Beta currently works fine). I'm not sure if this is a Linux vs. Windows issue or a discrete vs. embedded GPU issue at this stage; I will investigate further when I have time.
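As an aside, the un-project mentioned above can be sketched in a few lines of R (an illustrative translation of the idea only; the page source contains the actual GLSL/JavaScript, and the names below are mine):

# Un-projecting a screen point: multiply by the inverse of the combined
# projection-view matrix, then unwind the perspective (Z) division by w.
unproject <- function(PV, ndc_xy, ndc_z) {
  p_clip  <- c(ndc_xy, ndc_z, 1)     # point in normalised device coords, homogeneous
  p_world <- solve(PV) %*% p_clip    # the linear algebra trick: one matrix inverse
  (p_world / p_world[4])[1:3]        # divide by w to undo the Z-division
}

Un-projecting the cursor position on the near and far planes (ndc_z of -1 and 1) yields a pick-ray through the scene, which is one common way to find the fragment under the mouse.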
NB: As at 14-March-2023, on Brave Browser 1.49.120 on Windows with a discrete GPU the simulation struggles to render 5 FPS, while on Brave Browser 1.50.85 (beta) on Linux with an embedded GPU it works OK. I can, however, point to other vizualisation artefacts elsewhere that 1.50.85 cannot handle (but which previous versions of Brave/Chrome could). For example, on the homepage of vizdynamics.com, the Humanized Data Robot should gently move around the screen with a slight parallax effect when one moves the mouse over it. Why is the rendering engine in Chrome suddenly regressing, and why is it not using the GPU? This wave simulation should be able to run in its entirety on SIMD architecture, and the Humanized Data Robot used to render flawlessly. What is going on?

At 30 FPS, the wave simulation requires around 75 MFLOPS of processing power (a back-of-envelope check appears at the end of this post). To put that into perspective, the first Sun Microsystems SPARCstation, released in 1989, could manage 16.2 MIPS (roughly comparable to MFLOPS), and the SPARCstation 2 (released 1991) nearly 30 MIPS. That was over 30 years ago, and a SPARCstation 2 had enough compute power to happily calculate the same wave simulation at around 10-15 FPS without vizualising it, or at 2-5 FPS once a GPU pipeline is involved (thank god SGI turned IRIS GL into OpenGL).

I still have my copy of the original OpenGL Programming Guide (1st edition, 6th printing) that came with my Silicon Graphics Indy workstation. It was a curious book, and I implemented my first OpenGL version of these wave simulations from it in 1996, so I'm quite familiar with what to expect. An Indy could handle this quite well, with a bit of careful tuning. The hardness of this vizualisation is the tremendous number of sin() operations, plus the cos() operations used to take their derivatives, so it really does test the compute power of a graphics pipeline quite well; if the machine or the implementation isn't up to it, these calculations will bring it to its knees quite quickly.

Fast-forward to 2023, and a basic i7 cannot run the simulation at 30 FPS! To put things into perspective, a 2010-era Intel i7 980 XE is capable of over 100 GFLOPS (more than 1000x the 75 MFLOPS required), and that's without engaging any discrete or integrated SIMD GPU cores. Simply put, the animation in the background of this blog should be trivial for any computing device available today, and should run without interruption. Let's see how well things progress through March and whether things improve.
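For anyone who wants to sanity-check the 75 MFLOPS figure, here is a back-of-envelope sketch in R. Every number in it is an assumption of mine (mesh size, wave count, trig cost), not a measurement of the actual shader:

# Back-of-envelope FLOPS estimate for a sum-of-sines wave mesh (all numbers assumed).
vertices   <- 100 * 100   # assumed mesh resolution
waves      <- 5           # assumed number of superimposed sine waves
flop_trig  <- 20          # rough FLOP cost of one sin() or cos(), hardware dependent
per_vertex <- waves * 2 * flop_trig + 50  # sin() for height, cos() for the derivative,
                                          # plus ~50 FLOPs of transform overhead
fps <- 30
per_vertex * vertices * fps / 1e6         # ~75 MFLOPS

Under those assumed numbers the arithmetic lands on 75 MFLOPS; the point is the order of magnitude, not the decimals.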
REFERENCES

[1] Rost, Randi J., and John M. Kessenich. OpenGL Shading Language. 2nd ed. Addison-Wesley, 2006. (OpenGL 2.0 + GLSL 1.10)
[2] Shreiner, Dave. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 2. 5th ed. Addison-Wesley, 2006.

Posted by Andrew (AP) Prendergast at 5:15 pm. 2 comments.

SATURDAY, 4 MARCH 2023

YOU ARE ALL IN A VIRTUAL

I love Urban Dictionary; watch out for my @CompSciFutures updates. Here's one I posted today:

This is actually a commentary on attack surfaces: back in the day, one's lead architect knew the entirety of a system's attack surface and would secure it appropriately. Today, systems are so complex that no single person knows the full intricacies of the attack surface of even a small web application with an accompanying mobile app. The security implications of this are profound, hence the reason I am writing a textbook on the topic. Link to more Urban Dictionary posts in the footer.

Posted by Andrew (AP) Prendergast at 11:02 pm. No comments.

TUESDAY, 21 FEBRUARY 2023

SOFTWARE ENGINEERING MANUAL OF STYLE, 3RD EDITION

My apologies to everyone I was supposed to follow up with in January; I've been writing a textbook. I'll get back to you late February/early March. I'm locking down and getting this done so we can address the systemic roots of this ridiculous cyber security problem we have all found ourselves in. The book is called:

The Software Engineering Manual of Style, 3rd Edition
A secure by design, secure by default perspective for technical and business stakeholders alike.

The textbook is 120 pages of expansion on a coding style guide I have maintained for over 20 years and which I hand to every engineer I manage. The previous version was about 25 pages, so this edition is a bit of a jump!

SECURE-BY-DESIGN, SECURE-BY-DEFAULT SOFTWARE ENGINEERING. THE HANDBOOK.

It covers the entirety of software engineering at a very high level, but has intricate details of information security baked into it, including how and why things should be done a certain way to avoid building insecure technology that is vulnerable to attack. Not just tactical things, like avoiding 50% of buffer overruns or most SQL injection attacks (and leaving the rest of the input validation attacks unaddressed): this textbook redefines the entire process of software engineering, from start to finish, with security in mind from page 1 to page 120.

Safe coding and secure programming are not enough to save the world. We need to start building technology according to a secure-by-design, secure-by-default software engineering approach, and the world needs a good reference manual on what that is. This forthcoming textbook is it.

LATEST EXCERPTS

21-Feb-23 Excerpt: The Updated V-Model of Software Testing (DOI: 10.13140/RG.2.2.23515.03368)
21-Feb-23 Excerpt: The Software Engineering Standard Model (DOI: 10.13140/RG.2.2.23515.03368)
EDIT 22-Mar-23: Proof showing that usability testing is no longer considered non-functional testing (DOI: 10.13140/RG.2.2.23515.03368)
EDIT 22-Mar-23: The Pillars of Information Security, the attack surface kill-switch riddle + the elements of authenticity & authentication (DOI: 10.13140/RG.2.2.12609.84321)
EDIT 22-Apr-23: The revised Iterative Process of Modelling & Decision Making (DOI: 10.13140/RG.2.2.11228.67207/1)
EDIT 18-May-23: The Lifecycle of a Vulnerability (DOI: 10.13140/RG.2.2.23428.50561)

AUDIENCE

I'm trying to write it so that its processes and methodologies:

* can be baked into a firm by CXOs using strategic management principles; or
* embraced directly by engineers and their team leaders, without the CEO's shiny teeth and meddlesome hands getting involved.

Writing about very technical matters for both audiences is hard and time consuming, but I think I'm getting the hang of it!

ABSTRACT FROM THE COVER PAGE

The foreword/abstract from the first page of the text reads as follows:

"The audience of this textbook is engineering-based, degree-qualified computer science professionals looking to perfect their art and standardise their methodologies, and the business stakeholders that manage them.
This book is not a guide from which to learn software engineering, but rather offers best-practice, canonical guidance to existing software engineers and computer scientists on how to exercise their expertise and training with integrity, class and style. This text covers a vast array of topics at a very high level, from coding & ethical standards, to machine learning, software engineering and, most importantly, information security best practices. It also provides basic MBA-level introductory material relating to business matters, such as line, traffic & strategic management, as well as advice on how to handle estimation, financial statements, budgeting, forecasting, cost recovery and GRC assessments. Should a reader find any of the topics in this text of interest, they are encouraged to investigate them further by consulting the relevant literature. References have been carefully curated, and specific sections are cited where possible."

The book is looking pretty good: it is thus far what it is advertised to be.

HELPING OUT AND DONATING

The following link will take you to a LinkedIn article in which I am publishing various pre-print extracts (some are also published above). If you are in the field of computer science or software engineering, you might be able to help by providing some peer review. If not, there is a link to an Amazon booklist through which you can also contribute to this piece of work by donating a book or two.

Visit the book's homepage on LinkedIn »

And feel free just to take a look and see where we're going and what's being done to ensure that, moving forward, we stop engineering such terribly insecure software. Any support to that end would be most appreciated.

Edited 22-Mar-23: added usability testing proof
Edited 22-Mar-23: added Pillars of Cybersecurity

Posted by Andrew (AP) Prendergast at 12:03 pm. No comments.

SUNDAY, 1 JANUARY 2023

VIZLAB 2.0 CLOSURE

VizDynamics is still a trading entity, but VizLab 2.0 is now closed. The lack of attention to cybersecurity by state actors, big tech and cloud operators globally made it impossibly difficult to continue operating an advanced computer science lab with 6 of Melbourne's best computer scientists supporting corporate Australia. We could have continued, but we saw this perfect storm of cyber security coming and decided to dial down VizLab starting in 2017. Given recent cyber disasters, it is clear we made the right decision.

State actors, cloud operators and big tech need to be careful with "vendor backdoor" legislation such as key escrow of encryption, because this form of 'friendly fire' hides the initial attack vector and the initial point of network contact when trying to forensically analyse and close down real attacks. While that sort of legislation is in place without appropriate access controls, audit controls, detective controls, cross-border controls, kill switches and transparency reporting, it is not commercially viable for us to operate a high-powered CS lab, because all the use cases corporate Australia wants us to solve involve cloud-based PII, for example 'Prediction Lakes'.

A CS LAB DESIGNED FOR PAIRED PROGRAMMING

Part MIT Media Lab, part CMU Auton Lab, part vizualisation lab, part paired-programming heaven: this was VizLab 2.0.

VizLab doorway with ingress & egress SIFER readers, Inner Range high-frequency monitoring & enterprise-class CCTV. A secure site within a secure site.
A BSOD in The Lab: the struggles of WebGL and 3D everywhere.
Note the 3-screen 4K workstation in the foreground, designed for paired local + remote programming.
The Lab: part vizualisation, part data immersion. VizLab 2.0. Note the Eames at the end of the centre aisle.

Each workstation was set up for paired programming: 6 workstations where you could plug in 2 keyboards, 2 mice and 2 chairs side by side, with enough room so that you weren't breathing on each other, and returns either side big enough for a plethora of academic texts, client specs and all the notes you could want. With 2 HDMI cables hung from the roof linking to projectors on opposing walls that could reach any workstation at any moment, this was an environment for working on hard things; collaboratively, together.

Note the lack of client seating. Visitors would have to perch on the edge of a desk and see everything, or take the Eames lounge at the end of the room and try to see everything, or the white real-leather Space Furniture couch next to a Tom Dixon and see nothing. We played host to management teams from a plurality of ASX200s, and the first thing that struck them was that we didn't have seating for them, because for the next few hours they were going to be moving around and staring at walls, computers, the ceiling. There simply was nowhere to sit when a team of computer scientists, trained in computer graphics and very deft with data science, were taking them on a journey into Data.

THE NOW: VIZLAB 3.0 – THE FUTURE: VIZLAB 4.0

AP is still around and is spending the better part of 2023 writing a textbook on secure software engineering, and we've set up a smaller two-man VizLab 3.0 for cyber defence research, mainly around GRC assessment and computer science education at both secondary and tertiary levels. AP is currently doing research in that field so we can hopefully reduce cyber-risk down to a level that is acceptable to corporate Australia by increasing the mathematical and cyber-security awareness and literacy of computer science students as they enter university and then industry. If we can help to create that environment, then VizLab 4.0 may materialise, bigger and better; but because we dialled back our insurances (it's not practical to be paying $25K p.a. while we're doing research), we aren't in a position to provide direct consultation services at the moment.

VizDynamics is still a trading entity, and a new visualisation-based information security brand might be launched sometime in 2024 based on the "Humanizing Data" vizualisation thesis (perhaps through academia or government; we're not sure yet). Fixes are happening to WebGL rendering engines, and cyber security awareness is slowly rising to the top of the agenda, so our work is shifting the needle and we're moving in the right direction. If you want to keep track of what AP's up to, vizit blog.andrewprendergast.com.

Posted by Andrew (AP) Prendergast at 9:11 pm. No comments.

SUNDAY, 4 DECEMBER 2022

THE PATH TO BUILDING STARK-TREK STARSHIPS

Posted by Andrew (AP) Prendergast at 10:30 pm. No comments. Location: San Francisco, CA, USA

SATURDAY, 5 NOVEMBER 2022

THE MIND BLOWING HARBINGER OF WIRED 1.01

Wired magazine 1.01 was published in 1993 by Nicholas Negroponte & Louis Rossetto. Every issue contained, inside the front cover, a 'Mind Grenade', and the one from the very first issue (1.01) is, in hindsight, creepy. Here it is:

The 'mind grenade' from Wired 1.01, circa March 1993.
Damn, Professor Negroponte, you ring truer every day.

Posted by Andrew (AP) Prendergast at 3:19 pm. No comments.

FRIDAY, 21 OCTOBER 2022

RECOMMENDER SYSTEMS

Recommender systems are huge outside of Australia and the USA, to the point that most marketing managers elsewhere now consider their optimisation as important as Search Engine Marketing (SEM). I can't believe we have totally missed the ball on this one, and nobody on the other side of the planet, from Dubai to London, has bothered to tell us! Anyway, here's the original seminal paper on recommender systems, directed and promoted by Andreas Weigend (ex-Stanford, market genius, inventor of Prediction Markets and The BRIC Bank, and Chief Scientist Emeritus of Amazon.com). It's based on proper West Coast Silicon Valley AI, with a quality discussion of a number of related technologies and market effects that impact recommender systems. Enjoy!

Posted by Andrew (AP) Prendergast at 3:36 am. No comments.

SUNDAY, 17 JULY 2022

I LOVE FOURIER DOMAIN

I've been playing with building a Swarm Intelligence simulator based on a Fourier-domain discretisation to schedule the placement of drones in 3D space and cars in 2D space. Here's a little video demo of its basic structure in action; on top of this sit some differential equations to capture the displacement field, then the drone position coords:

LinkedIn post with a video demo of the simulator in structural mode. You need to be logged into LinkedIn to see the post.

If you want to have a play with this class of sine wave, you might notice a simpler simulation in the background of this blog. It has a few extra features not normally seen in these types of simulation: instead of a single point being able to move along one axis (usually the Y-axis), every point in my simulation can move anywhere along the X, Y or Z axis (a toy sketch of the idea follows below). Take a look yourself: left-click and drag the mouse on the background (where the 3D simulation is happening) to rotate the simulation in realtime. Look below the surface to see the mesh; above it, you get a flat view. For best effect, try a full-screen browser, remove all content and view just the background wave simulation.
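To give a flavour of what "every point can move along X, Y or Z" means, here is a toy R sketch of a sum-of-sines displacement field. It is my own illustrative construction with made-up coefficients, not the blog's shader or the drone scheduler:

# Toy displacement field: each grid point gets a full 3D offset, not just a height.
displace <- function(x, z, tm) {
  dx <- 0.1 * sin(2 * x + tm)                          # lateral ripple along X
  dy <- 0.5 * sin(x + tm) + 0.3 * sin(3 * z + 2 * tm)  # classic height waves along Y
  dz <- 0.1 * cos(z - tm)                              # lateral ripple along Z
  c(x + dx, dy, z + dz)                                # displaced 3D position of the point
}
grid <- expand.grid(x = seq(0, 10, 0.5), z = seq(0, 10, 0.5))
mesh <- t(mapply(displace, grid$x, grid$z, tm = 1))    # the whole mesh at time tm = 1

Summing more sine terms is exactly the Fourier-domain idea: the field is a superposition of simple waves, and the simulation just evaluates that superposition at each point and time step.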
Posted by Andrew (AP) Prendergast at 4:14 pm. No comments.

SUNDAY, 31 MARCH 2019

MY FAVOURITE VIZ OF ALL TIME

HOW GOOGLE USED VIZUALISATION TO BECOME ONE OF THE WORLD'S MOST VALUABLE COMPANIES

At VizDynamics we have done a lot of 'viz'-ualisation, so I've seen more than several lifetimes' worth of dashboards, reports, KPIs, models, metrics, insights and all manner of presentation and interaction approaches thereof.

YET ONE VIZ HAS ALWAYS STUCK IN MY MIND.

More than a decade ago, when I was post start-up exit and sitting out a competitive-restraint clause, I entertained myself by travelling the world in search of every significant thought leader and publication about probabilistic reasoning that I could find. Some were very contemporary; others were most ancient. I tried to read them all.

A much younger me @ the first Googleplex (circa 2002)

Some of this travelling included regular visits to Googleplex 1.0, back before they floated and well before anyone knew just how much damn cash they were making. As part of these regular visits, I came across a viz at the original 'Plex that blew me away. It sat in a darkened hall, in a room full of engineers, on a small table at the end of a row of cubicles. On this little IBM screen was an at-the-time closely guarded viz:

The "Live Queries" vizualisation @ Googleplex 1.0

Notice the green data points on the map? They are monetised searches. Notice the icons next to the search phrases? More "$" symbols meant better monetisation. This was pre-NPS, but the goal was the same: link $ to :) then lay it bare for all to see.

WHAT MAKES THIS UNASSUMING VIZ SO GOOD?

Its purpose. Guided by Schmidt's steady hand, Larry & Sergey (L&S) had amassed the brainpower of 300+ world-leading engineers, then unleashed them by allowing them to work independently. They now needed a way for the engineers to self-govern and self-optimise their continual improvements to product & revenue whilst keeping everyone aligned to Google's users-first mantra. The solution was straightforward: use vizualisation to bring the users into the building for everyone to see, provide a visceral checkpoint of their mission and progress, and do it in a humanely digestible manner.

Simple in form and embracing of Tufteism, the bottom third of the screen scrolled through user searches as they occurred, whilst the top area was dedicated to a simple map projection showing where the last N searches had originated. An impressively unpretentious viz that let the Data talk to one's inner mind. The pictograph in the top section was for visual and spatially aware thinkers; under that was tabular Data for the more quantitative types. And there wasn't a single number or metric in sight (well, not directly anyway). Three obviously intentional design principles, executed well. More than just a viz, this was a software solution to a plurality of organisational problems.

To properly understand the impact, imagine yourself for a moment as a Googler, briskly walking through the Googleplex towards your next meeting or snack or whatever. You alter your route slightly so you can pass by a small screen on your way through. The viz on the screen:

* instantly and unobtrusively brought you closer to your users,
* persistently reminded you and the rest of the (easily distracted) engineers to stay focused on the core product,
* provided constant feedback on the financial performance of recent product refinements, and
* inspired new ideas before you continued down the hall.

THE BEST VIZUALISATIONS HUMANISE DIFFICULT DATA IN A VISCERAL WAY

This was visual perfection because it was relevant to everyone, from L&S down to the most junior of interns. Every pixel served a purpose, coming together into an elegantly simple view of Google's current state. Data made so effortlessly digestible that it spoke to one's subconscious mind with only a passing glance. A viz so powerful that it helped Google become one of the world's most valuable companies. This was a portal into people's innermost thoughts and desires as they typed them into Google. All this... on one tiny little IBM screen, at the end of a row of cubicles.

Posted by Andrew (AP) Prendergast at 1:29 am. No comments. Location: Sunnyvale, Silicon Valley, CA, USA

THURSDAY, 1 JUNE 2017

ACLAND STREET – THE GRAND LADY OF ST KILDA

Acland Street is the result of two years of research. As well as extended archival and social media research, more than 150 people who had lived, worked, and played in Acland Street were interviewed to reveal its unique social, cultural, architectural, and economic history.
Of course we got a mention on page 133:

Note the special mention under 'The Technology', page 133. Circa 1995, published 2017.

Posted by Andrew (AP) Prendergast at 4:38 pm. No comments. Location: Acland St, Melbourne, Australia

TUESDAY, 1 SEPTEMBER 2015

CXO LEADERS SUMMIT

Posted by Andrew (AP) Prendergast at 3:02 pm. No comments. Location: Sydney NSW, Australia

THURSDAY, 30 OCTOBER 2014

INTRO TO BAYESIAN REASONING LECTURE

Here's a quick one to make the files available online from today's AI lecture at RMIT University. Much thanks to Lawrence Cavedon for making it happen.

DOWNLOADS

* Lecture Notes (PDF)
* Course/grade/intelligence plate model example (Netica)
* Output from sneezing diagnosis class exercise (Netica)
* Burglar hidden Markov model example (Netica)
* Same burglar HMM in Excel (Excel)

Have fun, and feel free to email me once you get your bayes-nets up and running!

Posted by Andrew (AP) Prendergast at 7:07 am. No comments. Location: RMIT University, Melbourne, Australia

THURSDAY, 16 OCTOBER 2014

ACCESSING DATA WAREHOUSES WITH MDX RUNNER

It's always good to give a little something back, so each year I do some guest lecturing on data warehousing to RMIT's CS Masters students. We usually pull a data warehouse box out of our compute cloud for the session so I can walk through the whole end-to-end stack, from the hardware through to the dashboards. The session is quite useful and always well received by students.

This year the delightful Jenny Zhang and I showed the students MDX Runner, an abstraction used at VizDynamics on a daily basis to access our data warehouses. As powerful as MDX is, it has a steep learning curve, and the result sets it returns can be bewildering to access programmatically. MDX Runner eases this pain by abstracting out the task of building and consuming MDX queries.

Given that it has usefulness far beyond what we do at VizDynamics, I have made arrangements for MDX Runner to be open-sourced. If you are running Analysis Services or any other MDX-compatible data warehousing environment, take a look at mdxrunner.org; you will certainly find it useful. Do reach out with updates if you test it against any of the other BI platforms. Hopefully over time we can start building out a nice generalised interface into Oracle, Teradata, SAP HANA and SSAS.

Posted by Andrew (AP) Prendergast at 3:58 pm. 2 comments. Location: RMIT University, Melbourne, Australia

SATURDAY, 13 SEPTEMBER 2014

BIDDING: AXIOMS OF DIGITAL ADVERTISING + TRAFFIC VALUATION

In this post I share a formal framework for reasoning about advertising traffic flows; it is how black-box optimisers work, and it needs to be covered before we get into any models. If you are a marketer, the advertising material will be old hat; if you are a data scientist, the axioms will seem almost obvious. What is useful is combining the advertising + science views and the interesting conclusions about traffic valuation one can draw from the combination. The framework is generalised and can be applied to a single placement or to an entire channel.

CREATIVE VS. DATA – WHO WILL WIN?

I should preface by saying my view on creative is that it is more important than the quality of one's analysis and media buying prowess.
All the data crunching in the world is not worth a pinch if the proposition is wrong or the execution is poor. On the other hand, an amazing ad for a great product, delivered to just the right people at the perfect moment, will set the world on fire.

PROBLEM DESCRIPTION

The digital advertising optimisation problem is well known: analyse the performance data collected to date and find the advertising mix that allocates the budget in such a way as to maximise expected revenue. This can be divided into three sub-problems: assigning conversion probabilities to each of the advertising opportunities; estimating the financial value of advertising opportunities; and finding the Optimal Media Plan. The most difficult of these is the assessment of conversion probabilities. Considering only the performance of a single placement or search phrase tends to discard large volumes of otherwise useful data (for example, the performance of closely related keywords or placements). What is required is a technique that makes full use of all the data in calculating these probabilities without double-counting any information.

THE HOLY TRIUMVIRATE OF DIGITAL ADVERTISING

In most digital advertising marketplaces, forces are such that traffic with a high conversion probability will cost more than traffic with a lower conversion probability (see Figure 1). This is because advertisers are willing to pay a premium for better quality traffic flows while simultaneously avoiding traffic with low conversion probability. Digital advertising also possesses the property that the incremental cost of traffic increases as an advertiser purchases more traffic from a publisher (see Figure 2). For example, an advertiser might increase the spend on a particular placement by 40%, but it is unlikely that any new deal would generate an additional 40% increase in traffic or sales.

Figure 1: Advertiser demand causes the cost of traffic to increase with conversion probability
Figure 2: Publishers adjust the cost of traffic upward exponentially as traffic volume increases

To counter this effect, sophisticated marketers grow their advertising portfolios by expanding into new sites and opportunities (by adding more placements), rather than by paying more for the advertising they already have. This horizontal expansion creates an optimisation problem: given a monthly budget of $x, what allocation of advertising will generate the most sales? That configuration is the Optimal Media Plan.

Figure 3: The Holy Triumvirate of Digital Advertising: Cost, Volume and Propensity

NB: There are plenty of counterexamples when this response surface is observed in the wild. For example, with Figure 2, some placements are more logarithmic than exponential, while others are a combination of the two. A good agency spends their days navigating and/or negotiating this so that one doesn't end up overpaying.

To solve the Optimal Media Plan problem, one needs to know three things for every advertising opportunity: the cost of each prospective placement; the expected volume of clicks; and the propensity of the placement to convert clicks into sales (see Figure 3). This Holy Triumvirate of Digital Advertising (cost, volume and propensity) is constrained along a response surface that ensures low-cost, high-propensity, high-volume placements occur infrequently and without longevity. For the remainder of this post (and well into the future), propensity will be considered exclusively in terms of ConversionProbability.
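To make the Optimal Media Plan concrete, here is a toy greedy allocation in R. The placements and every number are invented for illustration; a real optimiser must also respect the rising incremental costs of Figure 2, which is what makes the problem hard:

# Toy Optimal Media Plan: rank placements by expected sales per dollar, buy while budget lasts.
placements <- data.frame(
  name   = c("p1", "p2", "p3"),
  cost   = c(1000, 2500, 4000),    # monthly cost of each placement ($)
  clicks = c(800, 1500, 5000),     # expected click volume
  p_conv = c(0.020, 0.030, 0.008)  # propensity, i.e. ConversionProbability per click
)
placements$exp_sales <- placements$clicks * placements$p_conv
placements <- placements[order(placements$exp_sales / placements$cost, decreasing = TRUE), ]
budget <- 5000
placements[cumsum(placements$cost) <= budget, ]  # p2 then p1 fit the budget; p3 does not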
This post provides a general framework for this media plan optimisation problem and explores how ConversionProbability relates to search and display advertising.

Read more »

Posted by Andrew (AP) Prendergast at 3:13 pm. No comments.

SATURDAY, 22 MARCH 2014

A GRAND THESIS

Oh dear, the game is up. Our big secret is out. We should have a parade.

THE FUTURE OF MODERNITY

This year is looking like the year computer scientists come out and confess that the world is undergoing a huge technology-driven revolution based on simple probabilities. Or perhaps it's just that people have started to notice the rather obvious impact it is making on their lives (the hype around the recent DARPA Robotics Challenge and Christine Lagarde's entertaining lecture last month are both marvellous examples of that).

This change is to computer science what quantum mechanics was to physics: a grand shift in thinking from an absolute and observable world to an uncertain and far less observable one. We are leaving the digital age and entering the probabilistic one. The main architects of this change are some very smart people and my favourite superheroes: Daphne Koller, Sebastian Thrun, Richard Neapolitan, Andrew Ng and Ron Howard (no, not the Happy Days Ron; this one). Behind this shift are a clique of innovators and 'thought leaders' with an amazing vision of the future. Their vision is grand, and they are slowly creating the global cultural change they need to execute it. In their vision, freeways are close to 100% occupied, all cars travel at maximum speed, and population growth declines to a sustainable level. This upcoming convergence of population to sustainable levels will not come from job-stealing or killer robots, but from increased efficiency and the better lives we will all live, id est, the kind of productivity increase that is inversely proportional to population growth. And then the world is saved... by computer scientists.

WHAT IS IT, SORT OF EXACTLY-ISH?

Classical computer science is based on very precise, finite and discrete things, like counting pebbles, rocks and shells in an exact manner. This classical science consists of many useful pieces, such as the von Neumann architecture, relational databases, sets, graph theory, combinatorics, determinism, Greek logic, sort + merge, and so many other well-defined and expressible-in-binary things.

What is now taking hold is a whole different class of computer-ey science, grounded in probabilistic reasoning and with some other thing called information theory thrown in on the sidelines. This kind of science allows us to deal in the greyness of the world. Thus we can, say, assign happiness values to whether we think those previously mentioned objects are in fact more pebbly, rocky or shelly, given what we know about the time of day and its effect on the lighting of pebble-ish, rock-ish and shell-ish looking things. Those happiness values are expressed as probabilities.

The convenience of this probability-based framework is its compact representation of what we know, as well as its ability to quantify what we do not(ish). Its subjective approach is very unlike the objectivism of classical stats. In classical stats we are trying to uncover a pre-existing, independent, unbiased assessment. In the Bayesian or probabilistic world, bias is welcomed, as it represents our existing knowledge, which we then update with real data. Whoa.
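That prior-plus-data update is a one-liner. Here is a toy R example with invented numbers: we start biased towards "pebble", observe a glint that pebbles rarely produce, and let Bayes' rule shift our happiness values:

# Toy Bayesian update: posterior = prior x likelihood, renormalised (numbers invented).
prior      <- c(pebble = 0.6, rock = 0.3, shell = 0.1)  # existing knowledge: the welcome bias
likelihood <- c(pebble = 0.2, rock = 0.5, shell = 0.9)  # P(observed glint | object type)
posterior  <- prior * likelihood / sum(prior * likelihood)
round(posterior, 2)  # pebble 0.33, rock 0.42, shell 0.25: the data overruled some of the bias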
This paradigm shift is far more general than just the building of robots; it's changing the world.

I SHALL NOW SHOW YOU THE EVIDENCE SO YOU MAY UPDATE YOUR PROBABILITIES

A testament to the power of this approach is that the market leaders in many tech verticals already have this math at their heart. Google Search is a perfect example: only half of their rankings are PageRank-based. The rest is a big probability model that converts your search query into a machine-readable version of your innermost thoughts and desires (to the untrained eye it looks a lot like magic). If you don't believe me, consider for a moment: how does Facebook choose what to display in your feed? How do laptops and phones interpret gestures? How do handwriting, speech and facial recognition systems work? Error correction? Chatbots? Emotion recognition? Game AI? Photosynth? Data compression? It's mostly all the same math. There are other ways, which are useful for some sub-problems, but they can all ultimately be decomposed or factored into some sort of Bayesian or Markovian graphical probability model.

Try it yourself: pick up your iPhone right now and ask the delightful Siri if she is probabilistic, then assign a happiness value in your mind as to whether she is. There, you are now a Bayesian.

APAC IS MISSING OUT

Notwithstanding small pockets of knowledge, we don't properly teach this material in Australia, partly because it is so difficult to learn. We are not alone here. Japan was recently struck down by this same affliction when their robots could not help to resolve the Fukushima disaster. Their classically trained robots cannot cope with the changes to their environment that probabilities so neatly quantify.

To give you an idea of how profound this thesis is, and how far and wide it will eventually travel, it is currently taught by the top American universities across many faculties. The only other mathematical discipline that has found its way into every aspect of science, business and humanities is Greek logic, and that is thousands of years old.

A NEAT MATHEMATICAL MAGIC TRICK

The Probabilistic Calculus subsumes Greek logic, predicate logic, Markov chains, Kalman filters, linear models, possibly even neural networks; that is because they can all be expressed as graphical probability models. Thus logic is no longer king. Probabilities, expected utility and value of information are the new general-purpose 'Bayesian' way to reason about anything, and can be applied in a boardroom setting as effectively as in the lab (a worked micro-example follows below).

One could build a probability model to reason about things like love; however, it's ill advised. For example, a well-trained model would be quite adept at answering questions like "what is the probability of my enduring happiness given a prospective partner with particular traits and habits." The ethical dilemma here is that a robot built on the Bayesian Thesis is not thinking as we know it; it's just a systematic application of an ingenious mathematical trick to create the appearance of thought. Thus, for some things, it simply is not appropriate to pretend to think deeply about a topic; one must actually do it.
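As a micro-example of that subsumption claim (my own toy construction, not drawn from any of the texts above): modus ponens is just a conditional probability table whose entries are 0 and 1, and softening those entries gives the probabilistic generalisation.

# Greek logic as a degenerate probability model (illustrative numbers only).
infer_B <- function(p_A, p_B_given_A, p_B_given_notA) {
  p_B_given_A * p_A + p_B_given_notA * (1 - p_A)  # law of total probability
}
infer_B(1, 1, 0)         # modus ponens: A is true and A implies B, so P(B) = 1
infer_B(0.9, 0.95, 0.2)  # softened version of the same rule: P(B) = 0.875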
WE NEED BANDWIDTH OR WE WILL DEVOUR YOUR 4G NETWORK WHOLE

These probabilistic apps of the future (some of which already exist) will be bandwidth-hogging monsters (quite possibly literally) that could make full use of low-latency fibre connections. These apps construct real-time models of their world based on vast repositories of constantly updated knowledge stored 'in the cloud'. The mechanics of this requires the ability to transmit and consume live video feeds whilst simultaneously firing off thousands of queries against grand mid- and big-data repositories. For example, an app might want to assign probabilities to what that shiny thing is over there, or whether it's just sensor noise, or whether you should buy it, or avoid crashing into it, or whether popular sentiment towards it is negative; and, oh dear, we might want to do that thousands of times per second by querying Flickr, Facebook and Google and and and. All at once. Whilst dancing. And wearing a Gucci augmented reality headset, connected to my Hermes product-aware wallet.

This repetitive probability calculation is exactly what robots do, but in a disconnected way. Imagine what is possible once they are all connected to the cloud. And then to each other. Then consider how much bandwidth it will require! But, more seriously, the downside of this is that our currently sufficient 4G LTE network will very quickly be overwhelmed by these magical new apps, in a similar way to how the iPhone crushed our 3G data networks. Given that i-devices and robots like to move around, I don't know whether FTTH would be worth the expense, but near-FTTH with a very high performance wireless local loop would certainly help here. At some point we will be buying Hugo Boss-branded Oculus Rift VR headsets, and they will need to plug into something a little more substantive than what we have today.

AHH OK, WHAT DOES THIS HAVE TO DO WITH ADVERTISING?

In my previous post I said I would be covering advertising things. So here it is, if you haven't already worked it out: this same probability guff also works with digital advertising, and astonishingly well. There, I said it; the secret is out. Time for a parade.

...some useful bits coming in the next post.

Posted by Andrew (AP) Prendergast at 12:32 am. No comments.

FRIDAY, 21 MARCH 2014

OH HAI

Fab, I'm blogging.

A CHUMP'S GAME

A good friend of mine, whilst working at a New York hedge fund, once said to me, "online advertising is a chump's game". At the time he was exploring the idea of constructing financial instruments around the trade of user attention. His comment was coming from just how unpredictable, heterogeneous and generally intractable the quantitative side of advertising can be. Soon after, he quickly recoiled from the world of digital advertising and re-ascended into the transparent market efficiency of haute finance; a world of looking for the next big "arb".

WHAT I DO

I am a data scientist and I work on this problem every day. Over the last 15 or so years I have come to find that digital advertising is, in fact, completely the opposite of a chump's game. Yes, media marketplaces are extraordinarily opaque and highly disconnected, but with that comes fantastically gross pricing inefficiencies, exploitable in a form of advertising arbitrage. The Wall Street guys saw this, but never quite cracked how to exploit it.

WHAT YOU WILL FIND HERE

If you have spent more than a little time with me, then in between mountain biking and heli-boarding at my two favourite places in the world, you will probably have heard about or seen a probability model or two. In the coming months I will be banging on about some of this, and in particular sharing a few easy tricks for how advertisers can use data to gain a bit of an advantage. With the right approach, it's rather simple.

The concepts I will present here are already built into the black-box ad platforms we use daily, the foundations of which are two closely related assumptions (illustrated by the sketch below):

* Any flow of advertising traffic has people behind it who possess a fixed amount of buying power and a more elastic willingness to exercise it.
* As individuals we are independent thinkers, but as a swarm we behave in remarkably predictable ways.
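The second assumption is the law of large numbers doing the heavy lifting, and a tiny R simulation (toy numbers of my own) makes the point: each click converts like an unpredictable coin flip, yet the swarm's daily conversion rate barely moves.

# Individually unpredictable, collectively predictable (illustrative simulation).
set.seed(42)
p_conv <- 0.03                                           # assumed true conversion propensity
daily  <- replicate(30, mean(rbinom(10000, 1, p_conv)))  # 30 days of 10,000 clicks each
range(daily)                                             # daily rates cluster tightly around 0.03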
My aim is that one will find the material useful with little more than a bit of Excel and one or two free-ish downloads. The approach is carefully principled, elegantly simple and astonishingly effective.

Achtung! This site makes use of in-browser 3D. If your computer is struggling, then you probably need a little upgrade, a GPU or a browser change. Modern data science needs compute power, and a lot of it.

The format is a mix of theory, worked examples and how-to, combined with a touch of spreadsheet engineering. A dear friend of mine, who has written more than a few articles for The Economist, will be helping me edit things to keep the technical guff to a minimum. I am hoping along the way that a few interesting people might also compare their own experiences and provide further feedback and refinement. If it's well received, then we might scale up the complexity. So, if digital advertising is your game, then stay tuned; this will be a bit of fun!

Posted by Andrew (AP) Prendergast at 12:02 am. No comments.

MONDAY, 16 JULY 2007

SORTING DATA FRAMES IN R

This is a grandfathered post copied across from my old blog, from when I was using MovableType (who remembers MovableType?!).

I frequently find myself having to re-order the rows of a data.frame based on the levels of an ordered factor in R. For example, I want to take this data.frame:

  product store sales
1       a    s1    12
2       b    s1    24
3       a    s2    32
4       c    s2    12
5       a    s3     9
6       b    s3     2
7       c    s3    29

And sort it so that the sales data from the stores with the most sales occur first:

  product store sales
3       a    s2    32
4       c    s2    12
5       a    s3     9
6       b    s3     2
7       c    s3    29
1       a    s1    12
2       b    s1    24

I keep forgetting the exact semantics of how it's done, and Google never offers any assistance on the topic, so here is a quick post to get it down once and for all, both for my own benefit and the greater good.

First we need some data:

productSalesByStore = data.frame(
  product = c('a', 'b', 'a', 'c', 'a', 'b', 'c'),
  store = c('s1', 's1', 's2', 's2', 's3', 's3', 's3'),
  sales = c(12, 24, 32, 12, 9, 2, 29)
)

Now construct a sorted summary of sales by store:

storeSalesSummary = aggregate(
  productSalesByStore$sales,
  list(store = productSalesByStore$store),
  sum)
storeSalesSummary = storeSalesSummary[
  order(storeSalesSummary$x, decreasing=TRUE), ]

storeSalesSummary should look like this:

  store  x
2    s2 44
3    s3 40
1    s1 36

Use that summary data to construct an ordered factor of store names:

storesBySales = ordered(
  storeSalesSummary$store,
  levels=storeSalesSummary$store
)

storesBySales is now an ordered factor that looks like this:

[1] s2 s3 s1
Levels: s2 < s3 < s1

Re-construct productSalesByStore$store so that it is an ordered factor with the same levels as storesBySales:

productSalesByStore$store = ordered(productSalesByStore$store, levels=storesBySales)

Note that neither the contents nor the order of productSalesByStore has changed (yet); just the datatype of the store column.
Finally, we use the implicit ordering of store to generate an explicit permutation of productSalesByStore so that we can sort the rows in a stable manner:

productSalesByStore = productSalesByStore[
  order(productSalesByStore$store), ]

And we are done!

Posted by Andrew (AP) Prendergast at 2:30 pm. No comments.

TUESDAY, 17 AUGUST 1999

PARALLEL READ/WRITE LOCKING AND MY ORACLE ADVENTURE

Back when Oracle 8i was a thing, Fibre Channel was all the rage and Oracle Parallel Database was on the Oracle roadmap, I was called into Oracle HQ in Redwood City, Silicon Valley. They'd pushed back the release of "Parallel" quarter after quarter, and a couple of senior engineers caught wind that I was the guy for custom-built memory managers and big, highly parallelised data structures on large SGI platforms. I had one data structure running on 16 racks of Silicon Graphics kit at a large data centre off Highway 101 for an investment bank (aka hedge fund), and I'd earned a reputation for being able to do parallel, distributed read/write locking of vast data structures with all CPU threads running at full speed and without any mutex locking or blocking. So I was summoned to Silicon Valley "for a chat".

Oracle had this grand idea for Oracle 8i to share Fibre Channel LUNs between hosts: your federated database would sit on one LUN, with multiple Oracle 8i instances on separate machines all accessing the same database in parallel (hence the name 'Parallel'). Oracle at the time was actively influencing the specs of Fibre Channel (FC-AL), but they just couldn't get it to work; so I was called in so they could pick my brains.

The visit was fun, but I was no dummy, and I certainly wasn't going to give up the secrets to building the world's fastest computing systems. I found the meeting quite entertaining, and it descended into an argument over Oracle's outrageous pricing. On a multi-CPU system CrayLinked together with other multi-CPU systems, why should I pay Oracle a licensing fee for every damn CPU when we had called in Mark Gurry (the guy who wrote the book on Oracle performance tuning) and tuned the crap out of Oracle so that it barely used a single CPU, maybe two under heavy load? I won the argument and secured special pricing from Oracle moving forward (possibly not what they had intended for our meeting; oh well, that's AP for you!).

A much more youthful-looking me standing next to the Oracle lake after our meeting

Posted by Andrew (AP) Prendergast at 11:20 pm. No comments. Location: Redwood City, Silicon Valley, CA, USA

TUESDAY, 27 JANUARY 1998

GEEKZ ON DEMAND

Geekz on Demand (G.O.D) was an HR consultancy I started back in 1997 with Richard Taylor. At the time it was tech boom 1.0, and there was a dearth of talent that properly knew what the Internet was and how to get things onto it. Enter The Geekbase, and it took off like a rocket, landing us in the news quite consistently:

Rowan and I on the cover of The Age, 27 Jan 1998.

Read more »

Posted by Andrew (AP) Prendergast at 8:02 pm. No comments.

THURSDAY, 18 AUGUST 1994

THE START OF THE TECH BOOM

I spent tech boom 1.0 putting as much as I could online.

PRE-TECH BOOM

Not unlike Steve Jobs, I started my tech career with phone phreaking.
It started with the AFP banging on the front door of my Brighton home in 1993 for phone phreaking and hacking (of course, being underage, no conviction was recorded). We were quite prolific and formed a loose, nameless collective of the absolute best of the day. We were awash with so many zero day c0dez and exploits that we had to invent "minus days". Feared by the rest of the hacking community, we had the ability to access any computer system or telephone network we desired. Even Julian Assange (aka 'prof') ph3ared us. Our threat-hunting was extraordinary: we learned how to come up with 5 new minus-day exploits in 24 hours, and we'd scan 1,000 telephone numbers on a bad day and 10,000 on a good one (basically an entire suburb). We even had a team of computer-illiterate guys with a van who spent their evenings going through corporate rubbish bins looking for manuals and other goodies.

Our innovations were countless, ranging from fax-bombing telephone numbers (a DoS attack directed at a voice line/mobile), digital phreaking using diverters and pads, hacking network switches to create conference bridges (or 'party lines' as we called them), and drag-net hacking. We even developed a code of ethics which kept us firmly in the realm of 'explorers' and away from the less desirable labels of 'fraudsters' or 'terrorists'.

The Black Hat Code of Ethics

* Don't take
* Don't break
* Don't change
* Don't steal (eg, credit card fraud)

The general rule was that if you broke one of those rules, then you had to put it right before you were done with the system you had hacked into.

We weren't without oversight either. For example, a senior cryptographer from the DSD would dial into our party line and give me and my mates a fortnightly lecture on number theory and cryptography (which, at 13 years old, is something special that lives with you the rest of your life). This went on for at least 9 months. We also had techs from Telstra watching over our activities, because we were basically pen-testing (penetration testing) digital telephony. Occasionally the techs would get in touch and give us a dressing-down for some of our activities, so we had boundaries. The DSD guy also found out what was the latest and greatest in the black-hat world, so he could keep his finger on the pulse. A really chilled, very Australian and very hilarious symbiotic 'free flow of information' type of relationship was shared by all us insiders.

Basically, we were a harmless bunch of Aussie bogans doing lots of hysterical phone pranks on a global scale, for the most part. To outsiders, though, we were the pinnacle of the 'trust no-one' ethos and were scary. Very scary. In the black-hat world, hax0rs would have wars, and we won every single one. We became so internationally renowned that the second ever DEFCON was held right here in Melbourne, and Captain Crunch (aka John Draper) flew out to meet us.

THE GOLDEN ERA

After that little event, I decided I was done with hax0ring and phreakx0ring and moved on to the next big thing: immersing myself in what would become known as 'The Tech Boom', being famous and doing haute-tech. It was so much more fun, and considerably less anti-social.

Me on the cover of the Computer Age, circa early 1995.

THE NETCAFE

We were everywhere, putting anything and everything online, helping anyone and everyone understand what this new Internet thing was. I quickly became the poster boy for the Internet here in Melbourne, and by 1995 we had stood up Australia's first Netcafe on Acland Street.
It was a hive of activity, and it attracted everyone. I quickly caught the attention of the media, including Wired magazine, but they didn't have any reporters in Australia. After a couple of calls, they did the next best thing: pro sponsorship. The Wired folk didn't usually do sponsorships, but someone at the office called Absolut Vodka and arranged for a crate of 6 Absolut Vodka bottles (including whichever one was on the back cover that month) to be sent to us, along with a fresh, air-freighted current-month edition of Wired magazine, direct to The Netcafe. Back then, Wired was the bible, but it also arrived in Australia 3 months late thanks to being sea-freighted.

Me sitting at a PC in the upstairs room at The Netcafe, featuring Win95 and surrounded by Marcsta artworks.
Business Review Weekly (BRW), June 1995.
Chris Beaumont at an exclusive Melbourne tech-futurist group, briefly discussing The Netcafe, and he mentions moi!
An old pic circa 1996 of my first home lab. Note the Beaumont oil painting on the wall and the Absolut Vodka t-shirt from being sponsored by Wired magazine.

The Netcafe was a special place. With support from Michael Bethune of OzOnline, Adam from Standard Computers and, of course, Microsoft, we created a bunch of rooms above the Deluxe cafe on Acland Street where people could experience both The Internet and Windows 95. 95 was a good operating system; it felt like a Silicon Graphics or Sun workstation, and it was fast. We showed it to everyone that came in, taking Melbourne from zero knowledge about The 'Net to being as educated as anyone in Silicon Valley. It was an amazing time. Even Jeff Kennett himself gave me a little tiny gold badge of Victoria as an acknowledgement of my contribution to the state.

BEAT MAGAZINE

During this period I crossed paths with Rob Furst, founder of the inky street-press music publication Beat Magazine. He quickly employed me as the e-editor, responsible for getting the street press mag onto the Internet. I was still a kid at the time, and still studying at Brighton Grammar. How I fitted everything in I don't know, but I did. It was a great time and fun was had by all:

List of contributors to Beat Magazine, circa 1995. Note the e-editor :)

TED NELSON

In 1996 a second luminary from Silicon Valley flew out to meet me: Theodore ('Ted') Holm Nelson, inventor of hypertext. He heard that something amazing was happening in Australia and came out here to take a look for himself. We chatted, solved a few problems, then he went on his way. I still have his business card today.

Posted by Andrew (AP) Prendergast at 1:02 am. No comments.

Home

INTERESTING COMPSCIFUTURES FACTOIDS

* PGP-Key: Here's mine, how and why you should too
* Unix homedir snippets, Windows gems + priceless Internets artefacts (for me only)
* View background wave simulation (click and drag to rotate scene)
* delog: JSON/XML logfile deserializer - converts unstructured CSV data to tabular
* My Amazon Wish List - feel free to donate to my cause and send me a gift!
* ProbabilisticLogic.AI

ABOUT ME (@COMPSCIFUTURES)

Andrew (AP) Prendergast
Australian (Neat Theoretical) Computer Scientist, AI expert, cryptographer + Information Security Specialist. MSc(CS), RMIT University | Cert. Advanced Cyber Security, Stanford University. Italian, Irish, Anglican Aussie. Possibly Ashken. 🌈

To get in touch: email me at ap@andrewprendergast.com, contact me on LinkedIn, or try @CompSciFutures on Twitter.
If you need to open a secure channel, e.g. for following up Responsible Disclosure of Vulnerabilities, use my PGP Key (no black-hat stuff, please).

COMPSCIFUTURES COPYRIGHT

(C) COPYRIGHT 1994 - 2023 Andrew Prendergast. All Rights Reserved. View FOSS licenses.

INFOSEC RESEARCH STATEMENT OF PURPOSE

My information security research mostly fits into the strategic areas of awareness & education, cyber defence, early warning systems, threat prevention, risk management and GRC assessment. I do not deal in exploits (as they are munitions); however, I do occasionally provide OSINT cyber intelligence on (highly) suspected minus-days critical infrastructure vulnerabilities.