AP on CompSciFutures

Saturday 10 February 2024

FORTHCOMING PAPER PRESENTATION

I've just had a Marketing Science paper accepted to a conference that is happening later this year.

Abstract is:

Predicting the Performance of Digital Advertising

Andrew Prendergast
Ex. Google, Nielsen//NetRatings, BBDO.

A first principles exploration of ethically sound, privacy-preserving simulation, prediction and evaluation of campaign optimization in a digital advertising setting, including publication and description of a number of anonymised paid advertising datasets from search and display campaigns across a multitude of clients.

We analysed the practical application of performance marketing by a digital media buying team in a large advertising agency and explored challenges faced by the business school graduate level campaign analysts in predicting performance of digital advertising transacted in Vickrey, silent bid and private deal settings, and explored the utility of truthful and non-truthful bidding with respect to risk preferences. Our study focuses on micro-conversion based ROI optimization of direct-response search and display activity, but found that the techniques developed are also applicable to “above the line” branding focused digital campaigns. We then rigorously executed several multi-million dollar search campaigns using the developed techniques and validated the Vickrey hypothesis that accurate assessment of placement valuations and truthful bidding maximises long run expected utility and campaign optimization stability.

The techniques presented include a practical approach to placement valuation & bidding which uses a simple Bayesian prior and can be calculated in Excel. We compare its predictive performance to more exotic models using a “poor-mans-simulation” ML model evaluation technique and find the results are competitive. The evaluation technique is presented and we demonstrate its a priori simulation of future campaign performance from past ad-server data collected a posteriori. A selection of datasets to aid in replication and improvement of our experimental results are also provided.
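For readers who have not met the truthful-bidding result mentioned in the abstract, here is a minimal sketch of it in a sealed-bid second-price (Vickrey) setting. It is not taken from the paper: the click value, the grid of rival bids and the function name are all assumptions made purely for illustration.

```python
from fractions import Fraction

def expected_utility(bid, value, rival_bids):
    """Expected utility of `bid` in a second-price auction.

    `rival_bids` lists equally likely values of the highest competing bid.
    We win whenever our bid exceeds the rival's, and then pay the rival's bid.
    """
    total = Fraction(0)
    for rival in rival_bids:
        if bid > rival:
            total += value - rival
    return total / len(rival_bids)

# Assumption: a placement truly worth 10 (say, cents per click) to us.
value = 10
# Assumption: the highest rival bid is uniform on 0.5, 1.5, ..., 19.5.
rival_bids = [Fraction(2 * k + 1, 2) for k in range(20)]

# Try every whole-number bid from 0 to 20 and keep the best one.
utilities = {bid: expected_utility(bid, value, rival_bids) for bid in range(21)}
best_bid = max(utilities, key=utilities.get)
```

Under these assumptions the best bid is the true value: shading the bid forfeits profitable wins, while overbidding only adds auctions that are won at a loss.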

... and I'll finally be finishing off this old blog post series on bidding.

Should be a good show. More details to follow.

Tuesday 14 March 2023

CURRENT STATE OF WEBGL / GLSL

If you are paying close enough attention, you might have noticed that in the background of this blog is a very simple and straightforward wave simulation using GLSL and WebGL. It has been written specifically with the OpenGL Shading Language, Second Edition [1,2] text in mind, so it is compatible with every single WebGL implementation ever conceived, and is almost completely decoupled from the browser. It has some intentional 'glitch' in it, which is a reference to the analog days of Sutherland's work.

Specifically, the simulation uses a textbook implementation of GLSL, as follows:

GLSL 1.2
OpenGL ES 2.0
WebGL 1.0

The only coupling to the browser is the opening of a GL context, and if one clicks on the animation in the right place, an "un-project" operation that unwinds the Z-division takes place so that the fragment that underlies the mouse cursor can be calculated and the scene can be rotated using a very primitive rotation scheme which includes gimbal lock (no quaternions here!). Both are extraordinarily simple 3D graphics operations that should not affect rendering at all and are the absolute minimum level of coupling one might expect. In short, it is the perfect test of the most basic of WebGL capability.

The un-project operation is written with the minimal amount of code required and uses a little linear algebra trick to do it very efficiently. Feel free to inspect the code to see how it's done.
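For anyone who has not met an un-project before, the following is a generic sketch of the idea and not the code used on this blog (it does not reproduce the linear algebra trick mentioned above). It assumes a standard OpenGL-style perspective matrix, and all function names and the sample point are made up for illustration.

```python
import math

def perspective(fov_y, aspect, near, far):
    """Standard OpenGL-style perspective matrix (row-major, column vectors)."""
    f = 1.0 / math.tan(fov_y / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def invert(m):
    """Invert a 4x4 matrix by Gauss-Jordan elimination with partial pivoting."""
    n = 4
    a = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def project(m, point):
    """Eye space to normalised device coordinates (applies the divide by w)."""
    x, y, z, w = mat_vec(m, [point[0], point[1], point[2], 1.0])
    return [x / w, y / w, z / w]

def unproject(m_inv, ndc):
    """NDC back to eye space: apply the inverse, then undo the divide by w."""
    x, y, z, w = mat_vec(m_inv, [ndc[0], ndc[1], ndc[2], 1.0])
    return [x / w, y / w, z / w]

proj = perspective(math.radians(60.0), 16.0 / 9.0, 0.1, 100.0)
eye_point = [1.0, -2.0, -5.0]  # hypothetical point in front of the camera
ndc = project(proj, eye_point)
recovered = unproject(invert(proj), ndc)
```

The second divide by w is what "unwinds" the Z-division: the inverse matrix returns the original point scaled by 1/w, so dividing by the recovered w restores it.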

Current State

Update 26-Jun-23: After much trial and error, and testing on many many devices, I now have successfully isolated three separate WebGL bugs. Now that I have the three bugs properly isolated, I'm starting the writeup and hope to submit the following bug reports in the next week or two, as follows:

Incorrect rendering of WebGL 1.0 scene in Google Chrome
WebGL rendering heisenbug causes GL context to crash in Chrome after handling some N fragments
WebGL rendering heisenbug causes GL context to incorrectly render scene after handling some N fragments

TBC...

The current state as at 14-March-2023 is that Chrome and other browsers are not able to run this animation for more than 24 hours without crashing, and on the latest versions of Chrome released early March, the animation has now slowed down to ridiculous FPS levels. Previously the animation ran at well over 30 FPS on most devices, but would crash after 24 hours.

This animation will quite happily run on an old iPad running Safari, however Chrome currently seems to be struggling. The number of vertices and the number of sin() operations that it needs to calculate is well within the capabilities of all modern processors, including those found in i-devices such as phones and tablets on which one can play a typical game.

Example of correct rendering on all modern browsers including Safari on iPad

Example of incorrect rendering on Chrome 111.0.5563.111 (64-bit) as at 23-Mar-23

Brave Browser (based on Chromium) renders correctly, but is horrendously slow in the GA branch (Beta currently works fine). I'm not sure if this is a Linux vs. Windows issue or discrete vs. embedded GPU at this stage, will investigate further when I have time.

NB. As at 14-March-2023, on Brave Browser 1.49.120 on Windows with a discrete GPU the simulation struggles to render 5 FPS, and on Brave Browser 1.50.85 (beta) on Linux with an embedded GPU it works OK, but I can point to other vizualisation artefacts elsewhere that 1.50.85 cannot handle (but which previous versions of Brave/Chrome could). For example, on the homepage of vizdynamics.com, the Humanized Data Robot should gently move around the screen with a slight parallax effect if one moves the mouse over it. Why is the rendering engine in Chrome suddenly regressing, and why is it not using the GPU? This wave simulation should be able to run in its entirety on SIMD architecture and the Humanized Data Robot used to render flawlessly. What is going on?

At 30 FPS, the wave simulation requires around 75 MFLOPS of processing power. To put that into perspective, the first Sun Microsystems SPARC Station released in 1989 was able to calculate 16.2 MIPS (similar to MFLOPS), and the SPARC Station 2 (released 1991) could calculate nearly 30 MIPS. That was over 30 years ago, and a SPARC Station 2 machine had enough compute power that it could happily calculate the same wave simulation at around 10-15 FPS without vizualising it, but actually at 2-5 FPS once one implements a GPU pipeline (thank god SGI bought out IRIS GL).

I still have my copy of the original OpenGL Programming Guide (1st edition, 6th printing) that came with my Silicon Graphics Indy workstation. It was a curious book, and I implemented my first OpenGL version of these wave simulations in 1996 according to it, so I'm quite familiar with what to expect. An Indy could handle this - with a bit of careful tuning - quite well. The hardness of this vizualisation is the tremendous number of sin() operations, and the cosine()s used to take their derivative, so it really does test the compute power of a graphics pipeline quite well - if the machine or the implementation isn't up to it, these calculations will bring it to its knees quite quickly.
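To make the sin()/cosine() point concrete, here is a small sketch of a height field built from a sum of sine waves, with its slope taken analytically via cosines and turned into a surface normal. It is written in Python rather than GLSL for readability, and the wave components below are hypothetical, not the ones used in the background simulation.

```python
import math

# Hypothetical components: (amplitude, wavenumber_x, wavenumber_z, angular_frequency)
WAVES = [(0.50, 1.0, 0.0, 1.2), (0.25, 0.6, 0.8, 2.0), (0.10, -1.5, 2.0, 3.1)]

def height(x, z, t):
    """Surface height y = h(x, z, t) as a sum of travelling sine waves."""
    return sum(a * math.sin(kx * x + kz * z - w * t) for a, kx, kz, w in WAVES)

def gradient(x, z, t):
    """Analytic dh/dx and dh/dz: every sine term contributes one cosine term."""
    dhdx = sum(a * kx * math.cos(kx * x + kz * z - w * t) for a, kx, kz, w in WAVES)
    dhdz = sum(a * kz * math.cos(kx * x + kz * z - w * t) for a, kx, kz, w in WAVES)
    return dhdx, dhdz

def normal(x, z, t):
    """Unit normal of the height field, as would be used for lighting."""
    dhdx, dhdz = gradient(x, z, t)
    length = math.sqrt(dhdx * dhdx + 1.0 + dhdz * dhdz)
    return (-dhdx / length, 1.0 / length, -dhdz / length)

sample_height = height(0.3, -1.1, 2.0)
sample_normal = normal(0.3, -1.1, 2.0)
```

Every vertex pays for one sin() per wave for its height and one cos() per wave for its slope, which is where the trigonometric workload per frame comes from.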

Fast-forward to 2023, and a basic i7 cannot run the simulation at 30 FPS! To put things into perspective, a 2010 era Intel i7 980 XE is capable of over 100 GFLOPS (about 1000x more processing power than what's required to do 75 MFLOPS), and that's without engaging any discrete or integrated SIMD GPU cores. Simply put, the animation in the background of this blog should be trivial for any computing device available today, and should run without interruption.
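The back-of-envelope arithmetic behind these figures can be checked in a few lines. This sketch uses only the numbers quoted in this post and, as above, treats MIPS as roughly interchangeable with MFLOPS, which is a simplification; the variable names are mine.

```python
# Figures quoted in this post.
required_flops = 75e6   # ~75 MFLOPS needed at 30 FPS
target_fps = 30
sparc2_flops = 30e6     # SPARC Station 2: "nearly 30 MIPS", treated as MFLOPS
i7_flops = 100e9        # 2010 era i7: "over 100 GFLOPS"

# Work per frame implied by the 75 MFLOPS figure.
flops_per_frame = required_flops / target_fps

# Frame rate a SPARC Station 2 could sustain on the raw calculation alone.
sparc2_fps = sparc2_flops / flops_per_frame

# How much spare compute the i7 has relative to the requirement.
i7_headroom = i7_flops / required_flops

print(f"{flops_per_frame:.0f} FLOPs per frame")
print(f"SPARC Station 2: about {sparc2_fps:.0f} FPS without vizualisation")
print(f"i7 headroom: about {i7_headroom:.0f}x")
```

That works out to 2.5 million floating point operations per frame, about 12 FPS on the SPARC Station 2 (inside the 10-15 FPS range quoted above) and a headroom of roughly 1,300x on the i7.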

Let's see how well things progress through March and if things improve.

Update 30-Jan-24: Adding an FFT to make the wave simulation faster makes the problem go away, or potentially causes it to take longer to crash. Dunno. Have asked the Brave team to look into it:

Hey @brave @BraveSupport @BrendanEich While I have your attention:

I have isolated the long-standing bug with long running WebGL memory leaking till the browser crashes in Chromium on some GPUs. #SIGGRAPH @siggraph

More info here - please submit to the Chromium team, I…
— AP on CompSciFutures (∀/acc) (@CompSciFutures) January 29, 2024

References

[1] Rost, Randi J., and John M. Kessenich. OpenGL Shading Language. 2nd ed. Addison-Wesley, 2006. OpenGL 2.0 + GLSL 1.10

[2] Shreiner, Dave. OpenGL Programming Guide the Official Guide to Learning OpenGL, Version 2. 5th ed. Upper Saddle River, NJ u.a: Addison-Wesley, 2006.

Saturday 4 March 2023

YOU ARE ALL IN A VIRTUAL

I love Urban Dictionary - watch out for my @CompSciFutures updates. Here's one I posted today:

This is actually a commentary on attack surfaces, specifically that back in the day, one's lead architect knew the entirety of a system's attack surface and would secure it appropriately. Today, systems are so complex that no single man knows all the intricacies of the attack surface for even just a small web application with an accompanying mobile app. The security implications of this are profound, hence the reason why I am writing a textbook on the topic.

Link to more Urban Dictionary posts in the footer.

Tuesday 21 February 2023

SOFTWARE ENGINEERING MANUAL OF STYLE, 3RD EDITION

My apologies to everyone I was supposed to follow up with in January - I've been writing a textbook. I'll get back to you late February/early March, I'm locking down and getting this done so we can address the systemic roots of this ridiculous cyber security problem we have all found ourselves in.

The book is called:

The Software Engineering Manual of Style, 3rd Edition
A secure by design, secure by default perspective for technical and business stakeholders alike.

The textbook is 120 pages of expansion on a coding style guide I have maintained for over 20 years and which I hand to every engineer I manage. The previous version was about 25 pages, so this edition is a bit of a jump!

Secure-by-design, secure-by-default software engineering. The handbook.

It covers the entirety of software engineering at a very high level, but has intricate details of information security baked into it, including how and why things should be done a certain way to avoid building insecure technology that is vulnerable to attack. Not just tactical things, like avoiding 50% of buffer overruns or most SQL injection attacks (and leaving the rest of the input validation attacks unaddressed). This textbook redefines the entire process of software engineering, from start to finish, with security in mind from page 1 to page 120.

Safe coding and secure programming are not enough to save the world. We need to start building technology according to a secure-by-design, secure-by-default software engineering approach, and the world needs a good reference manual on what that is.

This forthcoming textbook is it.

Latest Excerpts

21-Feb-23 Excerpt: The Updated V-Model of Software Testing
(DOI: 10.13140/RG.2.2.23515.03368)

21-Feb-23 Excerpt: The Software Engineering Standard Model
(DOI: 10.13140/RG.2.2.23515.03368)

EDIT 22-Mar-23: Proof showing that usability testing is no longer considered non-functional testing
(DOI: 10.13140/RG.2.2.23515.03368)

EDIT 22-Mar-23: The Pillars of Information Security, The attack surface kill-switch riddle +
The elements of authenticity & authentication (DOI: 10.13140/RG.2.2.12609.84321)

EDIT 22-Apr-23: The revised Iterative Process of Modelling & Decision Making
(DOI: 10.13140/RG.2.2.11228.67207/1)

EDIT 18-May-23: The Lifecycle of a Vulnerability
(DOI: 10.13140/RG.2.2.23428.50561)

Audience

I'm trying to write it so its processes and methodologies can be:

baked into a firm by CXOs using strategic management principles; or
embraced directly by engineers and their team leaders without the CEO's shiny teeth and meddlesome hands getting involved.

Writing about very technical matters for both audiences is hard and time consuming, but I think I'm getting the hang of it!

Abstract from the cover page

The foreword/abstract from the first page of the text reads as follows:

"The audience of this textbook is engineering based, degree qualified computer science professionals looking to perfect their art and standardise their methodologies and the business stakeholders that manage them. This book is not a guide from which to learn software engineering, but rather, offers best practices canonical guidance to existing software engineers and computer scientists on how to exercise their expertise and training with integrity, class and style. This text covers a vast array of topics at a very high level, from coding & ethical standards, to machine learning, software engineering and most importantly information security best practices.

It also provides basic MBA-level introductory material relating to business matters, such as line, traffic & strategic management, as well as advice on how to handle estimation, financial statements, budgeting, forecasting, cost recovery and GRC assessments.

Should a reader find any of the topics in this text of interest, they are encouraged to investigate them further by consulting the relevant literature. References have been carefully curated, and specific sections are cited where possible."

The book is looking pretty good: it is thus far what it is advertised to be.

Helping out and donating

The following link will take you to a LinkedIn article in which I am publishing various pre-print extracts (some are also published above).

If you are in the field of computer science or software engineering, you might be able to help by providing some peer-review. If not, there is a link to an Amazon booklist, so you can also contribute to this piece of work by donating a book or two.

Visit the book's homepage on LinkedIn »

And feel free just to take a look and see where we're going and what's being done to ensure that moving forward, we stop engineering such terribly insecure software. Any support to that end would be most appreciated.

Edited 22-Mar-23: added usability testing proof
Edited 22-Mar-23: added Pillars of Cybersecurity

Sunday 1 January 2023

VIZLAB 2.0 CLOSURE

VizDynamics is still a trading entity, but VizLab 2.0 is now closed. The lack of attention to cybersecurity by state actors, big tech and cloud operators globally made it impossibly difficult to continue operating an advanced computer science lab with 6 of Melbourne's best computer scientists supporting corporate Australia. We could have continued, but we saw this perfect storm of cyber security coming and decided to dial down VizLab starting in 2017. Given recent cyber disasters, it is clear we made the right decision.

State actors, cloud operators and big tech need to be careful with “vendor backdoor” legislation such as key escrow of encryption, because this form of 'friendly fire' hides the initial attack vector and the initial point of network contact when trying to forensically analyse and close down real attacks. Whilst that sort of legislation is in place without appropriate access controls, audit controls, detective controls, cross-border controls, kill switches and transparency reporting, it is not commercially viable for us to operate a high powered CS lab, because all the use cases corporate Australia want us to solve involve cloud-based PII, for example, ‘Prediction Lakes’.

A CS Lab designed for paired programming

Part MIT Media lab, part CMU Autonlab, part vizualisation lab, part paired-programming heaven: this was VizLab 2.0.

VizLab doorway with ingress & egress Sipher readers, Inner Range high frequency monitoring & enterprise class CCTV.
A secure site in a secure site.

A BSOD in The Lab: The struggles of WebGL and 3D everywhere.
Note the 3-screen 4k workstation in the foreground designed
for paired local + remote programming.

The Lab - Part vizualisation, part data immersion. VizLab 2.0.
Note the Eames at the end of the centre aisle.

Each workstation was set up for paired programming. 6 workstations, where you could plug in 2 keyboards, 2 mice, 2 chairs side-by-side and enough room so that you weren't breathing on each other, with returns either side big enough for a plethora of academic texts, client specs and all the notes you could want. With 2 HDMI cables hung from the roof linking to projectors on opposing walls that could reach any workstation at any moment, this was an environment for working on hard things; collaboratively, together.

Note the lack of client seating. They would have to perch on the edge of a desk and see everything, or sit in an Eames lounge at the end of the room and try to see everything, or on a white real leather Space Furniture couch next to a Tom Dixon and see nothing. We played host to management teams from a plurality of ASX200s, and the first thing that struck them was that we didn't have seating for them, because for the next few hours they were going to be moving around and staring at walls, computers and the ceiling. There simply was nowhere to sit when a team of computer scientists, trained in computer graphics and very deft with data science, were taking them on a journey into Data.

The now: VizLab 3.0 – The future: VizLab 4.0

AP is still around and is spending the most part of 2023 writing a textbook on secure software engineering, and we've set up a smaller two-man VizLab 3.0 for cyber defence research mainly around GRC assessment and computer science education both at a secondary and a tertiary level. AP is currently doing research in that field so we can hopefully reduce cyber-risk down to a level that is acceptable to corporate Australia by increasing the mathematical and cyber-security awareness and literacy of computer science students as they enter into university and then industry.

If we can help to create that environment, then VizLab 4.0 may materialise and will be bigger and better, but because we dialed back our insurances (it's not practical to be paying $25K pa while we're doing research), we aren't in a position to provide direct consultation services at the moment. VizDynamics is still a trading entity, and a new visualisation based Information Security brand might be launched somewhere in 2024 based on the “Humanizing Data” vizualisation thesis (perhaps through academia or government – we’re not sure yet). Fixes are happening to WebGL rendering engines, and slowly cyber security awareness is rising to the top of the agenda, so our work is slowly shifting the needle and we’re moving in the right direction.

If you want to keep track of what AP's up to, vizit blog.andrewprendergast.com.

Sunday 4 December 2022

THE PATH TO BUILDING STAR-TREK STARSHIPS

Saturday 5 November 2022

THE MIND BLOWING HARBINGER OF WIRED 1.01

Wired magazine 1.01 was published in 1993 by Nicholas Negroponte & Louis Rossetto.

Every issue contained inside the front cover a 'Mind Grenade', and the one from the very first issue (1.01) is -- in hindsight -- creepy. Here it is:

The 'mind grenade' from Wired 1.01, Circa March 1993.

Damn, Professor Negroponte, you ring truer every day.

Friday 21 October 2022

RECOMMENDER SYSTEMS

Recommender systems are huge outside of Australia and the USA, to the point that most marketing managers now consider their optimisation as important as Search Engine Marketing (SEM). I can't believe we have totally missed the ball on this one, and nobody on the other side of the planet, from Dubai to London, has bothered to tell us!

Anyways, here's the original seminal paper that Andreas Weigend (ex Stanford, market genius and inventor of Prediction Markets and The BRIC Bank, Chief Scientist Emeritus of Amazon.com and inventor of recommender systems) directed and promoted. It's based on proper West Coast Silicon Valley AI, with a quality discussion about a number of related technologies and market related effects that impact recommender systems.

Enjoy!

Sunday 17 July 2022

I LOVE FOURIER DOMAIN

I've been playing with building a Swarm Intelligence simulator based on a Fourier domain discretisation to schedule the placement of drones in 3D space and cars in 2D space. Here's a little video demo of its basic structure in action. On top of this are some differential equations to capture the displacement field, then drone position coords:

LinkedIn post with a video demo of the simulator in structural mode.
You need to be logged into LinkedIn to see the post.

If you want to have a play with this class of sine wave, you might notice a simpler simulation in the background of this blog. It has a few extra features not normally seen in these types of simulation: instead of a single point being able to move along one axis (usually the Y-axis), every point in my simulation can move anywhere along the X, Y or Z axis. Take a look yourself: left-click and drag the mouse on the background (where the 3D simulation is happening) to rotate the simulation in realtime. Look below the surface to see the mesh, above it and you get a flat view.

For best effect, try full-screen browser, remove all content and view just the background wave simulation.

Sunday 31 March 2019

MY FAVOURITE VIZ OF ALL TIME

How Google used vizualisation to become one of the world's most valuable companies

At VizDynamics we have done a lot of 'viz'-ualisation, so I’ve seen more than several life-times worth of dashboards, reports, KPIs, models, metrics, insights and all manner of presentation and interaction approaches thereof.

Yet one Viz has always stuck in my mind.

More than a decade ago when I was post start-up exit and sitting out a competitive-restraint clause, I entertained myself by travelling the world in search of every significant thought leader and publication about probabilistic reasoning that I could find. Some were very contemporary; others were most ancient. I tried to read them all.

A much younger me @ the first Googleplex (circa 2002)

Some of this travelling included regular visits to Googleplex 1.0, back before they floated and well before anyone knew just how much damn cash they were making. As part of these regular visits, I came across a viz at the original ‘Plex that blew me away. It sat in a darkened hall in a room full of engineers on a small table at the end of a row of cubicles. On this little IBM screen was an at-the-time closely guarded viz:

The "Live Queries" vizualisation @ Googleplex 1.0

Notice the green data points on the map? They are monetised searches. Notice the icons next to the search phrases? More “$” symbols meant better monetisation. This was pre-NPS, but the goal was the same – link $ to :) then lay it bare for all to see.

What makes this unassuming viz so good?

Its purpose.

Guided by Schmidt’s steady hand, Larry & Sergey (L&S) had amassed the brainpower of 300+ world leading engineers, then unleashed them by allowing them to work independently. They now needed a way for them to self-govern and -optimise their continual improvements to product & revenue whilst keeping everyone aligned to Google's users-first mantra.

The solution was straightforward: use vizualisation to bring the users into the building for everyone to see, provide a visceral checkpoint of their mission and progress, and do it in a humanely digestible manner.

Simple in form & embracing of Tufteism, the bottom third of the screen scrolled through user searches as they occurred, whilst the top area was dedicated to a simple map projection showing where the last N searches had originated from. An impressively unpretentious viz that let the Data talk to one’s inner mind. The pictograph in the top section was for visual and spatially aware thinkers, under that was tabular Data for the more quantitative types. And there wasn’t a single number or metric in sight (well not directly anyway). Three obviously intentional design principles executed well.

More than just a Viz, this was a software solution to a plurality of organizational problems.

To properly understand the impact, imagine yourself for a moment as a Googler, briskly walking through the Googleplex towards your next meeting or snack or whatever. You alter your route slightly so you can pass by a small screen on your way through. The viz on the screen:

instantly and unobtrusively brought you closer to your users,
persistently reminded you and the rest of the (easily distracted) engineers to stay focused on the core product,
provided constant feedback on financial performance of recent product refinements, and
inspired new ideas

before you continued down the hall.

The best vizualisations humanise difficult data in a visceral way

This was visual perfection because it was relevant to everyone, from L&S down to the most junior of interns. Every pixel served a purpose, coming together into an elegantly simple view of Google's current state. Data made so effortlessly digestible that it spoke to one’s subconscious mind with only a passing glance. A viz so powerful that it helped Google to become one of the world’s most valuable companies. This was a portal into people's innermost thoughts and desires as they were typing them into Google. All this... on one tiny little IBM screen, at the end of a row of cubicles.

Thursday 1 June 2017

ACLAND STREET – THE GRAND LADY OF ST KILDA

Acland Street is the result of two years of research. As well as extended archival and social media research, more than 150 people who had lived, worked, and played in Acland Street were interviewed to reveal its unique social, cultural, architectural, and economic history.

Of course we got a mention on page 133:

Note the special mention under 'The Technology', page 133. Circa 1995, published 2017.

Tuesday 1 September 2015

CXO LEADERS SUMMIT

"Supporting your intuition with large scale data," Andrew Prendergast, Chief Scientist, VizDynamics #CXOLeadersSummit
— CXO Leaders (@CXOLeaders) August 20, 2015

Thursday 30 October 2014

INTRO TO BAYESIAN REASONING LECTURE

Here's a quick one to make the files available online from today's AI lecture at RMIT University. Much thanks to Lawrence Cavedon for making it happen.

Downloads

Lecture Notes (PDF)

Course/grade/intelligence plate model example (Netica)

Output from sneezing diagnosis class exercise (Netica)

Burglar hidden Markov model example (Netica)

Same burglar HMM in Excel (Excel)

Have fun and feel free to email me once you get your bayes-nets up and running!

Thursday 16 October 2014

ACCESSING DATA WAREHOUSES WITH MDX RUNNER

It's always good to give a little something back, so each year I do some guest lecturing on data warehousing to RMIT's CS Masters students.

We usually pull a data warehouse box out of our compute cloud for the session so I can walk through the whole end-to-end stack from the hardware through to the dashboards. The session is quite useful and always well received by students.

This year the delightful Jenny Zhang and I showed the students MDX Runner, an abstraction used at VizDynamics on a daily basis to access our data warehouses. As powerful as MDX is, it has a steep learning curve and the result sets it returns can be bewildering to access programmatically. MDX Runner eases this pain by abstracting out the task of building and consuming MDX queries.

Given that it has usefulness far beyond what we do at VizDynamics, I have made arrangements for MDX Runner to be open-sourced. If you are running analysis services or any other MDX-compatible data warehousing environment, take a look at mdxrunner.org - you will certainly find it useful.

Do reach out with updates if you test it against any of the other BI platforms. Hopefully over time we can start building out a nice generalised interface into Oracle, Teradata, SAP HANA and SSAS.

Saturday 13 September 2014

BIDDING: AXIOMS OF DIGITAL ADVERTISING + TRAFFIC VALUATION

In this post I share a formal framework for reasoning about advertising traffic flows. It is how black box optimisers work and needs to be covered before we get into any models. If you are a marketer, then the advertising stuff will be old hat, and if you are a data scientist then the axioms will seem almost obvious.

What is useful is combining this advertising + science view and the interesting conclusions about traffic valuation one can draw from it. The framework is generalised and can be applied to a single placement or to an entire channel.

Creative vs. Data – Who will win?

I should preface by saying my view on creative is that it is more important than the quality of one's analysis and media buying prowess. All the data crunching in the world is not worth a pinch if the proposition is wrong or the execution is poor.

On the other hand, an amazing ad for a great product delivered to just the right people at the perfect moment will set the world on fire.

Problem Description

The digital advertising optimisation problem is well known: analyse the performance data collected to date and find the advertising mix that allocates the budget in such a way that maximises the expected revenue.

This can be divided into three sub-problems: assigning conversion probabilities to each of the advertising opportunities; estimating the financial value of advertising opportunities; and finding the Optimal Media Plan.

The most difficult of these is the assessment of conversion probabilities. Considering only the performance of a single placement or search phrase tends to discard large volumes of otherwise useful data (for example, the performance of closely related keywords or placements). What is required is a technique that makes full use of all the data in calculating these probabilities without double-counting any information.
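As a rough illustration of borrowing strength from related placements, here is a minimal Beta-Binomial sketch. It is not the model developed later in this series: the prior strength of 20 pseudo-clicks, the sample figures and the function names are all assumptions. To avoid double-counting, the prior is built only from the other placements, never from the one being scored.

```python
def pooled_prior(related, strength=20.0):
    """Beta prior (alpha, beta) from related placements.

    `related` is a list of (clicks, conversions) pairs. `strength` is the
    number of pseudo-clicks the prior is worth (an assumed tuning knob).
    """
    clicks = sum(c for c, _ in related)
    conversions = sum(v for _, v in related)
    if clicks == 0:
        raise ValueError("related placements have no clicks")
    rate = conversions / clicks
    return rate * strength, (1.0 - rate) * strength

def posterior_rate(clicks, conversions, alpha, beta):
    """Posterior mean conversion probability under a Beta(alpha, beta) prior."""
    return (conversions + alpha) / (clicks + alpha + beta)

# Hypothetical history of closely related keywords: (clicks, conversions).
related = [(1000, 30), (500, 10), (1500, 50)]
alpha, beta = pooled_prior(related)

# A brand new keyword with 1 sale from 10 clicks is pulled towards the pool...
new_keyword = posterior_rate(clicks=10, conversions=1, alpha=alpha, beta=beta)
# ...while a mature keyword with plenty of data barely moves.
mature_keyword = posterior_rate(clicks=5000, conversions=250, alpha=alpha, beta=beta)
```

With these numbers the pooled rate is 3%, so the new keyword's raw 10% is shrunk to about 5.3%, while the mature keyword stays within a whisker of its observed 5%.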

The Holy Triumvirate of Digital Advertising

In most digital advertising marketplaces, forces are such that traffic with high conversion probability will cost more than traffic with a lower conversion probability (see Figure 1). This is because advertisers are willing to pay a premium for better quality traffic flows while simultaneously avoiding traffic with low conversion probability.

Digital advertising also possesses the property that the incremental cost of traffic increases as an advertiser purchases more traffic from a publisher (see Figure 2). For example, an advertiser might increase the spend on a particular placement by 40%, but it is unlikely that any new deal would generate an additional 40% increase in traffic or sales.

Figure 1: Advertiser demand causes the cost of traffic to increase with conversion probability

Figure 2: Publishers adjust the cost of traffic upward exponentially as traffic volume increases
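To put a number on the 40% example, here is a small sketch that assumes the exponential cost curve of Figure 2 in the form spend = scale * (exp(k * clicks) - 1). The curve's shape parameters and the spend levels are made up for illustration only.

```python
import math

def traffic_for_spend(spend, scale=1000.0, k=0.001):
    """Clicks bought for a given spend, inverting the assumed cost curve
    spend = scale * (exp(k * clicks) - 1)."""
    return math.log(1.0 + spend / scale) / k

base_clicks = traffic_for_spend(1000.0)    # current spend on the placement
raised_clicks = traffic_for_spend(1400.0)  # the same placement with 40% more spend
uplift = raised_clicks / base_clicks - 1.0

print(f"{base_clicks:.0f} -> {raised_clicks:.0f} clicks ({uplift:.1%} more traffic)")
```

On this hypothetical curve, 40% more spend buys only about 26% more traffic.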

To counter this effect, sophisticated marketers grow their advertising portfolios by expanding into new sites and opportunities (by adding more placements), rather than by paying more for the advertising they already have. This horizontal expansion creates an optimisation problem: given a monthly budget of $x, what allocation of advertising will generate the most sales? This configuration then is the Optimal Media Plan.

Figure 3: The Holy Triumvirate of Digital Advertising: Cost, Volume and Propensity

NB: There are plenty of counterexamples when this response surface is observed in the wild. For example with Figure 2, some placements are more logarithmic than exponential, while others are a combination of the two. A good agency spends their days navigating and/or negotiating this so that one doesn't end up overpaying.

To solve the Optimal Media Plan problem, one needs to know three things for every advertising opportunity: the cost of each prospective placement; the expected volume of clicks; and the propensity of the placement to convert clicks into sales (see Figure 3). This Holy Triumvirate of Digital Advertising (cost, volume and propensity) is constrained along a response surface that ensures that low cost, high propensity and high volume placements occur infrequently and without longevity.
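To show how cost, volume and propensity fit together, here is a toy greedy allocator. It is a heuristic sketch only and not the optimiser this series builds towards: the placements, their tranche prices and the function name are hypothetical, and each placement is assumed to sell traffic in tranches that get progressively worse value.

```python
def greedy_media_plan(placements, budget):
    """Repeatedly buy the affordable tranche with the best expected sales
    per dollar. A simple heuristic, not a guaranteed optimum."""
    next_tranche = {name: 0 for name in placements}
    spend = {name: 0.0 for name in placements}
    expected_sales = 0.0
    remaining = budget
    while True:
        best = None
        for name, placement in placements.items():
            index = next_tranche[name]
            if index >= len(placement["tranches"]):
                continue  # nothing left to buy on this placement
            cost, clicks = placement["tranches"][index]
            if cost > remaining:
                continue  # cannot afford this placement's next tranche
            sales = clicks * placement["propensity"]
            if best is None or sales / cost > best[0]:
                best = (sales / cost, name, cost, sales)
        if best is None:
            return spend, expected_sales
        _, name, cost, sales = best
        next_tranche[name] += 1
        spend[name] += cost
        expected_sales += sales
        remaining -= cost

# Hypothetical placements: propensity is ConversionProbability per click,
# tranches are (cost in dollars, clicks) in the order they must be bought.
placements = {
    "search": {"propensity": 0.05, "tranches": [(100, 200), (100, 120), (100, 60)]},
    "display_a": {"propensity": 0.01, "tranches": [(100, 800), (100, 500), (100, 300)]},
    "display_b": {"propensity": 0.02, "tranches": [(100, 350), (100, 200)]},
}
spend, expected_sales = greedy_media_plan(placements, budget=400)
```

With a $400 budget the plan spreads across all three placements before returning to search for a second tranche, which is the horizontal expansion described above.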

For the remainder of this post (and well into the future), propensity will be considered exclusively in terms of ConversionProbability. This post will provide a general framework for this media plan optimisation problem and explore how ConversionProbability relates to search and display advertising.

A GRAND THESIS

Oh dear, the game is up. Our big secret is out. We should have a parade.

The Future of Modernity

This year is looking like when computer scientists come out and confess that the world is undergoing a huge technology driven revolution based on simple probabilities. Or perhaps it's just that people have started to notice the rather obvious impact it is making on their lives (the hype around the recent DARPA Robotics Challenge and Christine Lagarde's entertaining lecture last month are both marvelous examples of that).

This change is to computer science what quantum mechanics was to physics: a grand shift in thinking from an absolute and observable world to an uncertain and far less observable one. We are leaving the digital age and entering the probabilistic one. The main architects of this change are some very smart people and my favorite super heroes - Daphne Koller, Sebastian Thrun, Richard Neapolitan, Andrew Ng and Ron Howard (no not the Happy Days Ron – this one).

Behind this shift is a clique of innovators and ‘thought leaders’ with an amazing vision of the future. Their vision is grand and they are slowly creating the global cultural change they need to execute it. In their vision, freeways are close to 100% occupied, all cars travel at maximum speed and population growth declines to a sustainable level.

This upcoming convergence of population to sustainable levels will not come from job-stealing or killer robots, but from increased efficiency and the better lives we will all live, id est, the kind of productivity increase that is inversely proportional to population growth.

And then the world is saved... by computer scientists.

What is it sort of exactly-ish?

Classical computer science is based on very precise, finite and discrete things, like counting pebbles, rocks and shells in an exact manner. This classical science consists of many useful pieces such as the von Neumann architecture, relational databases, sets, graph theory, combinatorics, determinism, Greek logic, sort + merge, and so many other well defined and expressible-in-binary things.

What is now taking hold is a whole different class of computer-ey science, grounded in probabilistic reasoning and with some other thing called information theory thrown in on the sidelines. This kind of science allows us to deal in the greyness of the world. Thus we can, say, assign happiness values to whether we think those previously mentioned objects are in fact more pebbly, rocky or shelly given what we know about the time of day and its effect on the lighting of pebble-ish, rock-ish and shell-ish looking things. Those happiness values are expressed as probabilities.

The convenience of this probability-based framework is its compact representation of what we know, as well as its ability to quantify what we do not(ish).

Its subjective approach is very unlike the objectivism of classical stats. In classical stats, we are trying to uncover a pre-existing independent, unbiased assessment. In the Bayesian or probabilistic world bias is welcomed as it represents our existing knowledge, which we then update with real data. Whoa.
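
To make "bias as existing knowledge, updated with real data" concrete, here is a minimal sketch in R that puts a Beta prior on a conversion probability. The prior strength and the click counts are invented for illustration, and the Beta-Binomial model is just one common choice for this kind of update.

```r
# Prior belief (an assumption for illustration): placements like this one
# convert about 2% of clicks, held with the weight of 100 imaginary clicks.
priorMean   <- 0.02
priorWeight <- 100
alphaPrior  <- priorMean * priorWeight          # imaginary conversions
betaPrior   <- (1 - priorMean) * priorWeight    # imaginary non-conversions

# Real data (made up here): 50 clicks, 3 of which converted.
clicks      <- 50
conversions <- 3

# Bayesian update for a Beta prior with binomial data: just add the counts.
alphaPost <- alphaPrior + conversions
betaPost  <- betaPrior + (clicks - conversions)

posteriorMean <- alphaPost / (alphaPost + betaPost)
rawRate       <- conversions / clicks

# 90% credible interval: a way of quantifying what we do not yet know.
credibleInterval <- qbeta(c(0.05, 0.95), alphaPost, betaPost)

print(c(prior = priorMean, raw = rawRate, posterior = posteriorMean))
print(credibleInterval)
```

The posterior lands between the prior (2%) and the raw observed rate (6%). With only 50 clicks the prior still carries most of the weight, and as more data arrives the estimate is pulled steadily towards what was actually observed.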

This paradigm shift is far more general than just the building of robots - it's changing the world.

I shall now show you the evidence so you may update your probabilities

A testament to the power of this approach is that the market leaders in many tech verticals already have this math at their heart. Google Search is a perfect example - only half of their rankings are PageRank based. The rest is a big probability model that converts your search query into a machine-readable version of your innermost thoughts and desires (to the untrained eye it looks a lot like magic).

If you don’t believe me, consider for a moment, how does Facebook choose what to display in your own feed? How do laptops and phones interpret gestures? How do handwriting, speech and facial recognition systems work? Error Correction? Chatbots? Emotion recognition? Game AI? PhotoSynth? Data Compression?

It’s mostly all the same math. There are other ways, which are useful for some sub-problems, but they can all ultimately be decomposed or factored into some sort of Bayesian or Markovian graphical probability model.

Try it yourself: Pick up your iPhone right now and ask the delightful Siri if she is probabilistic, then assign a happiness value in your mind as to whether she is. There, you are now a Bayesian.

APAC is missing out

Notwithstanding small pockets of knowledge, we don’t properly teach this material in Australia, partly because it is so difficult to learn.

We are not alone here. Japan was recently struck down by this same affliction when their robots could not help to resolve their Fukushima disaster. Their classically trained robots cannot cope with changes to their environment that probabilities so neatly quantify.

To give you an idea of how profound this thesis is, or how far and wide it will eventually travel, it is currently taught by the top American universities across many faculties. The only other mathematical discipline that has found its way into every aspect of science, business and humanities is Greek logic, and that is thousands of years old.

A neat mathematical magic trick

The Probabilistic Calculus subsumes Greek Logic, Predicate Logic, Markov Chains, Kalman Filters, Linear Models, possibly even Neural Networks, because they can all be expressed as graphical probability models. Thus logic is no longer king. Probabilities, expected utility and value of information are the new general purpose ‘Bayesian’ way to reason about anything, and can be applied in a boardroom setting as effectively as in the lab.

One could build a probability model to reason about things like love, although it's ill-advised. For example, a well-trained model would be quite adept at answering questions like “what is the probability of my enduring happiness given a prospective partner with particular traits and habits?”

The ethical dilemma here is that a robot built on the Bayesian Thesis is not thinking as we know it – it's just a systematic application of an ingenious mathematical trick to create the appearance of thought. Thus for some things, it simply is not appropriate to pretend to think deeply about a topic; one must actually do it.

We need bandwidth or we will devour your 4G network whole

These probabilistic apps of the future (some of which already exist) will drive bandwidth-hogging monsters (quite possibly literally) that could make full use of low latency fibre connections.

These apps construct real-time models of their world based on vast repositories of constantly updated knowledge stored ‘in the cloud’. The mechanics of this requires the ability to transmit and consume live video feeds, whilst simultaneously firing off thousands of queries against grand mid- and big-data repositories.

For example, an app might want to assign probabilities to what that shiny thing is over there, or if it's just sensor noise, or if you should buy it, or if you should avoid crashing into it, or if popular sentiment towards it is negative; and, oh dear, we might want to do that thousands of times per second by querying Flickr, Facebook and Google and and and. All at once. Whilst dancing. And wearing a Gucci augmented reality headset, connected to my Hermes product aware wallet.

This repetitive probability calculation is exactly what robots do, but in a disconnected way. Imagine what is possible once they are all connected to the cloud. And then to each other. Then consider how much bandwidth it will require!

But, more seriously, the downside of this is that our currently sufficient 4G LTE network will very quickly be overwhelmed by these magical new apps in a similar way to how the iPhone crushed our 3G data networks.

Given that i-Devices and robots like to move around, I don't know whether FTTH would be worth the expense, but near-FTTH with a very high performance wireless local loop certainly would help here. At some point we will be buying Hugo Boss branded Oculus Rift VR headsets, and they need to plug into something a little more substantive than what we have today.

Ahh OK, what does this have to do with advertising?

In my previous post I said I would be covering advertising things. So here it is if you haven't already worked it out: this same probability guff also works with digital advertising, and astonishingly well.

There I said it, the secret is out. Time for a parade.

...some useful bits coming in the next post.

Friday 21 March 2014

OH HAI

Fab, I’m blogging.

A Chump’s Game

A good friend of mine, whilst working at a New York hedge fund once said to me, “online advertising is a chump’s game”.

At the time he was exploring the idea of constructing financial instruments around the trade of user attention. His comment came from just how unpredictable, heterogeneous and generally intractable the quantitative side of advertising can be. Soon after, he recoiled from the world of digital advertising and re-ascended into the transparent market efficiency of haute finance; a world of looking for the next big “arb”.

What I Do

I am a data scientist and I work on this problem every day.

Over the last 15 or so years I have come to find that digital advertising is, in fact, completely the opposite of a chump's game – yes, media marketplaces are extraordinarily opaque and highly disconnected – but with that comes fantastically gross pricing inefficiencies exploitable in a form of advertising arbitrage.

The Wall Street guys saw this, but never quite cracked how to exploit it.

What you will find here

If you have spent more than a little time with me, then in between mountain biking and heli-boarding at my two favorite places in the world, you will have probably heard about or seen a probability model or two.

In the coming months I will be banging on about some of this, and in particular sharing a few easy tricks on how advertisers can use data to gain a bit of an advantage. With the right approach, it’s rather simple.

The concepts I will present here are already built into the black-box ad platforms we use daily, the foundations of which are two closely related assumptions:

Any flow of advertising traffic has people behind it who possess a fixed amount of buying power and a more elastic willingness to exercise it.
As individuals we are independent thinkers, but as a swarm, we behave in remarkably predictable ways.

My aim is that one will find the material useful with little more than a bit of Excel and one or two free-ish downloads. The approach is carefully principled, elegantly simple and astonishingly effective.

Achtung! This site makes use of in-browser 3D. If your computer is struggling, then you probably need a little upgrade, a GPU or a browser change. Modern data science needs compute power, and a lot of it.

The format is a mix of theory, worked examples and how-to, combined with a touch of spreadsheet engineering. A dear friend of mine – who has written more than a few articles for the Economist - will be helping me edit things to keep the technical guff to a minimum.

I am hoping along the way that a few interesting people might also compare their own experiences and provide further feedback and refinement. If it's well received then we might scale up the complexity.

So, if digital advertising is your game then stay tuned, this will be a bit of fun!

Monday 16 July 2007

SORTING DATA FRAMES IN R

This is a grandfathered post copied across from my old blog when I was using MovableType (who remembers MovableType?!)

I frequently find myself having to re-order rows of a data.frame based on the levels of an ordered factor in R.

For example, I want to take this data.frame:

     product store sales
   1       a    s1    12
   2       b    s1    24
   3       a    s2    32
   4       c    s2    12
   5       a    s3     9
   6       b    s3     2
   7       c    s3    29

And sort it so that the sales data from the stores with the most sales occur first:

     product store sales
   3       a    s2    32
   4       c    s2    12
   5       a    s3     9
   6       b    s3     2
   7       c    s3    29
   1       a    s1    12
   2       b    s1    24

I keep forgetting the exact semantics of how it's done and Google never offers any assistance on the topic, so here is a quick post to get it down once and for all, both for my own benefit and the greater good. First we need some data:

   productSalesByStore = data.frame(
         product = c('a', 'b', 'a', 'c', 'a', 'b', 'c'),
         store = c('s1', 's1', 's2', 's2', 's3', 's3', 's3'),
         sales = c(12, 24, 32, 12, 9, 2, 29)
      )

Now construct a sorted summary of sales by store:

   storeSalesSummary =
         aggregate(
                  productSalesByStore$sales,
                  list(store = productSalesByStore$store),
         sum)
   storeSalesSummary =
      storeSalesSummary[ 
         order(storeSalesSummary$x, decreasing=TRUE), 
         ]

storeSalesSummary should look like this:

     store  x
   2    s2 44
   3    s3 40
   1    s1 36

Use that summary data to construct an ordered factor of store names:

   storesBySales =
      ordered(
         storeSalesSummary$store,
         levels=storeSalesSummary$store
         )

storesBySales is now an ordered factor that looks like this:

   [1] s2 s3 s1
   Levels: s2 < s3 < s1

Re-construct productSalesByStore$store so that it is an ordered factor with the same levels as storesBySales:

   productSalesByStore$store =
      ordered(productSalesByStore$store, levels=storesBySales)

Note that neither the contents nor the order of productSalesByStore has changed (yet). Just the datatype of the store column. Finally, we use the implicit ordering of store to generate an explicit permutation of productSalesByStore so that we can sort the rows in a stable manner:

   productSalesByStore = 
      productSalesByStore[ order(productSalesByStore$store), ]

And we are done!
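
For comparison, the same ordering can be had in fewer steps. This is a minimal alternative sketch, not the approach above: it uses base R's `ave()` to attach each store's total to every row and sorts on that, so the store column is left as-is rather than converted to an ordered factor. The names `storeTotal` and `sortedSales` are just illustrative.

```r
# Same example data as above, rebuilt so this block runs on its own.
productSalesByStore <- data.frame(
   product = c('a', 'b', 'a', 'c', 'a', 'b', 'c'),
   store   = c('s1', 's1', 's2', 's2', 's3', 's3', 's3'),
   sales   = c(12, 24, 32, 12, 9, 2, 29)
)

# ave() returns, for every row, the total sales of that row's store.
storeTotal <- ave(productSalesByStore$sales, productSalesByStore$store, FUN = sum)

# Sort by store total, largest first. The store name is a tie-breaker that
# keeps each store's rows together if two stores have the same total, and
# order() leaves any remaining ties in their original row order.
sortedSales <- productSalesByStore[order(-storeTotal, productSalesByStore$store), ]
print(sortedSales)
```

The ordered-factor route is still the better fit when the store ordering needs to be reused later, for example as the axis order in a plot.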

Tuesday 17 August 1999

PARALLEL READ/WRITE LOCKING AND MY ORACLE ADVENTURE

Back when Oracle 8i was a thing, Fibre Channel was all the rage and on the Oracle roadmap was Oracle Parallel Database, I was called into the Oracle HQ in Redwood City, Silicon Valley.

They'd pushed back the release of "Parallel" quarter after quarter and a couple of senior engineers caught wind that I was the guy for custom-built memory managers and that I was doing big, highly parallelised data structures on large SGI platforms. I had one data structure running on 16 racks of Silicon Graphics kit at a large data centre off highway 101 for an investment bank (aka hedge fund), and I'd earned a reputation for being able to do parallel distributed read/write locking of vast data structures with all CPU threads running at full speed and without any mutex locking or blocking. So, I was summoned to Silicon Valley "for a chat".

Oracle had this grand idea for Oracle 8i to share Fibre Channel LUNs between hosts, and your federated database would sit on one LUN with multiple Oracle 8i instances on separate machines all accessing the same database in parallel (hence the name 'Parallel'). Oracle at the time was actively influencing the specs of Fibre Channel (FCAL), but they just couldn't get it to work -- so I was called in so they could pick my brains. The visit was fun, but I was no dummy, and I certainly wasn't going to give up the secrets of how to build the world's fastest computing systems.

I found the meeting quite entertaining and it descended into an argument over Oracle's outrageous pricing. On a multi-CPU system craylinked together with other multi-CPU systems, why should I pay Oracle a licensing fee for every damn CPU when we had called in Mark Gurry (the guy that wrote the book on Oracle Performance Tuning) and tuned the crap out of Oracle so that it barely used a single CPU, maybe two under heavy load? I won the argument and secured special pricing for Oracle moving forward (possibly not what they had intended for our meeting - oh well, that's AP for you!)

A much more youthful looking me standing next to the Oracle lake after our meeting

Tuesday 27 January 1998

GEEKZ ON DEMAND

Geekz on Demand (G.O.D) was an HR consultancy I started back in 1997 with Richard Taylor. At the time it was tech boom 1.0, and there was a dearth of talent that properly knew what the Internet was and how to get things onto it.

Enter The Geekbase, and it took off like a rocket, landing us in the news quite consistently:

Rowan and I on the cover of The Age, 27 Jan 1998.