Below is a masterful oil painting of the Sears Tower at dawn. You can't find it at the Art Institute, or the Met, or even through a Google search. It sits as copy 1 of 1 on my hard drive (and now in this post). The painting is a wholly original creation by a machine learning model called DALL-E, built by OpenAI, crafted from the prompt I gave it: "oil painting of downtown Chicago at sunrise."

DALL-E and similar AI models (Midjourney is another popular one) have burst onto the scene, to much amazement and consternation. They are winning art contests. They are redefining what it means to be an artist or designer. And they are just getting started.

It's not hard to imagine the above picture running alongside a fake breaking-news article about Ukraine bombing a Russian children's hospital. At a moment when trust in institutions is at an all-time low, when truth and facts are already up for debate, the slow, seemingly inexorable move toward AI-generated video content feels like a disaster waiting to happen. Another tool in the deepfake arsenal.

This will be a major problem that requires the cooperation of governments and the tech industry to solve. We need smart public policy and technical tools that are just as good at detecting AI-generated content as the models are at creating it. We all know that government policy is reactive, so the onus will be on the private sector to mitigate the problem before any damage is done. Perhaps there is a way to incentivize the big players in this space to police each other.

For the time being, it's important to stay aware of what's out there, and to recognize that there are always risks alongside the benefits of technological change. When it comes to AI, the stakes are high enough that we need to neutralize the risks before we can fully embrace the conveniences.


Stop Trying to Quantify the Dollar Value of A/B Testing. The Only Thing That Matters is Lessons-Learned.

I recently wrapped up an A/B test on a mobile application that brings in $6B+ in annual revenue. It was a failure. Our hypothesis was wrong, and we ended up funneling real-world consumers to a modified user experience that led them to spend less money than they otherwise would have. Whoops. The difference was marginal -- less than a dollar per user -- but multiply that over millions of site visits, and it starts to add up, quickly.

The good news is that, when it comes to A/B testing, there are no failures. In this case, learning what consumers don't want is just as valuable as learning what they do want. Here's what my team and I learned:

• We were able to validate our current user experience, and confirm past experiments that had led us to that point
• We were able to confidently reject our hypothesis, allowing us to spend our resources elsewhere
• We were able to use the above to design our next experiments
• We got one step closer to optimizing the app's landing page

Perhaps unsurprisingly, given the current economic climate, with budgets tightening and spending under a microscope, I was asked to quantify the cost -- and/or revenue -- of our team's ongoing A/B testing. And that's fine, and relatively easy to do. Because our B variant was a money loser, the "cost" to learn the aforementioned lessons was in the six figures. But that misses the point. The main benefit of quantifying the dollar value of an experimentation campaign is being able to justify that campaign to non-technical management. The real benefit -- much harder to quantify, though it can be done -- is in the lessons learned, and the decisions made based on those lessons.

Military jargon alert: A/B testing, and the decisions made from its outcomes, is your classic intel-operations cycle. Translated into pop-economic terms: incremental updates to your product, made from A/B testing, create a flywheel effect that grows your business.
What you should be focused on is capturing a repository of past, present, and future (planned) experiments, tied to the lessons learned and the decisions made from each. Because there's no finish line when it comes to optimizing your business and products, experimentation never ends. Instead, you should be designing tests along two parallel tracks:

1. "Bite-size" changes: Here is where you create smaller, incremental tweaks to your user experience. Your goal is to find the "optimal" experience, as defined by your North Star Metric. Because technology evolves and human beings get tired, you tend to see a decay in any lift you get from your A/B testing -- part of why you should be experimenting in the first place.

2. "Big swing" changes: Concurrently with your bite-size campaign, you should be funneling about 5% of your user base to a radical redesign of your user experience. This is a low-risk way to potentially uncover big lifts and big insights, and it positions you for when it's time for a big upgrade. Big swings are also the preferred experimentation technique for products with lower traffic volumes, where it's harder to reach the desired confidence level (see postscript) with small tweaks. Here's a great example of how Groove took a big swing and doubled their conversion rate.

One interpretation of my "failed" test is that we accidentally siphoned consumers into an experience that cost our company money. The correct interpretation is that the company paid for insights into its users that it couldn't get anywhere else. Insights that allow us to keep innovating, and that will continue to drive engagement and revenue well into the future.

So, while it's not wrong to quantify the dollar value of your A/B testing, understand that it's only a sideshow. The real value is in driving that flywheel, fueling your decision-making with data, and then using the decisions you've made to design the next test.
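A quick back-of-the-envelope calculation shows why lower-traffic products struggle to reach significance on small tweaks. The sketch below uses the textbook two-proportion sample-size approximation at 95% confidence and 80% power; the baseline rate and lifts are hypothetical numbers of my own, not figures from the experiment described above.

```python
from math import ceil

def sample_size_per_variant(p_base, relative_lift,
                            z_alpha=1.96,  # two-tailed z for 95% confidence
                            z_beta=0.84):  # z for 80% power
    """Approximate users needed per variant to detect a relative lift
    in a baseline conversion rate (two-proportion z-test sizing)."""
    p_var = p_base * (1 + relative_lift)
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    delta = p_var - p_base
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

# A 2% relative lift on a 5% baseline needs ~750k users per arm;
# a 10% lift needs roughly 25x fewer.
small_tweak = sample_size_per_variant(0.05, 0.02)
big_swing = sample_size_per_variant(0.05, 0.10)
```

If you only have tens of thousands of visitors, only the big swings are detectable in a reasonable timeframe -- which is exactly the traffic argument above.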
And how you capture that is what really matters.

Postscript: "Failed" A/B tests do exist -- when they're poorly designed and produce results that fall short of the 95% confidence level (your undergrad or MBA statistics class coming back to haunt you). In that case, because there is a greater than 1-in-20 likelihood that you arrived at the final results by chance, you cannot draw any conclusions from them. Some companies are willing to accept a confidence level as low as 80%, but the industry standard is 95%, with a corresponding critical value of 1.96.
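The significance check behind that 95% threshold is mechanical. Here's a minimal, dependency-free sketch of a two-proportion z-test; the function name and traffic numbers are mine, for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def ab_significant(conversions_a, visitors_a, conversions_b, visitors_b,
                   confidence=0.95):
    """Two-proportion z-test: is the difference between variants
    statistically significant at the given confidence level?"""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
    return p_value < (1 - confidence), z, p_value

# Hypothetical traffic: 5.0% vs 5.2% conversion over 100k visitors each.
significant, z, p = ab_significant(5_000, 100_000, 5_200, 100_000)
```

With these (made-up) volumes the 0.2-point difference clears the 95% bar; halve the traffic and the same rates would not, which is the trap the postscript warns about.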

Why I'm starting my Tech career at Nike

It's past midnight in swampy North Carolina. After wading through waist-high water, you're drenched and cold. You kneel by your radio; your night vision goggles have turned night into day, and you can see your team scattered throughout the woods, each working on his own task. Far from the nearest cellphone tower, you send a report directly up into space, where a network of geosynchronous satellites bounces the packets of information from your radio down to headquarters. Soon thereafter, they update the last known enemy position on their computer, and you see that change in real time on your Android-powered device. Next to you, the JTAC talks to the pilot silently orbiting above you, and watches the aircraft's camera feed on his own device as it shifts over to this new location. On a different radio, you tell the team over a mesh network that you'll be moving out in 2 minutes -- only 3 more miles through the woods to the objective.

I'm sure my fellow grunts and operators can relate to the above snapshot from a training exercise in the North Carolina heat. Nothing combines intense physicality and a mental challenge quite like life down with our nation's ground troops. There, your equipment -- your physical gear, clothing, and boots, as well as your technical equipment -- really has to work. More than that, it has to work so well that you almost forget it exists. And when you're out in the middle of nowhere, the technology you carry with you is the only lifeline back to the real world.

Last weekend, after a fantastic first week at Nike, a group of us drove down to Crater Lake. Turns out, we were a week too early -- we drove in through a snowstorm that scuttled some of our hiking plans. Undeterred, and finding the one trail that had been (partly) shoveled clear of snow, I went for a long run along the crater's rim. Armed with my Wildhorse 7 trail runners, tracking my run (and location!) with the Nike Run Club app, all while zigzagging around boulders, snowdrifts, and the occasional deer, I was able to reflect on my transition from the military, my first year at Chicago Booth, and my internship as a Technical Product Manager at Nike.

While Nike may not be anyone's definition of a pure tech company, I'm excited to work for a company that helps people reconnect with themselves, their bodies, and the world around them. It's a place where I'm getting exposed to e-commerce, mobile and web development, marketing personalization, and tools for supply-chain management. It's already a leader in metaverse experiences and in NFTs. And it's a place that makes digital (and physical) tools that just disappear into the background, so you can focus on the job at hand. And that's why I decided to start my tech career here.