Click on the audio link below for an interesting and humorous 16-minute take on this post by two AIs.
Note: The AI glitches around the eight-and-a-half-minute mark and reinserts its totally self-created four-minute conversation about itself. The first third of this audio blog was created based on this blog post and my guidance. The second third was just me listening, which the AI figures you should listen to twice. Thus, the third third.
What. Ever.
I love spiders. Yes, they are creepy, but while we humans, as *Homo sapiens*, have been around for 300,000 years, true spiders, those with spinnerets, the weavers of webs, first appeared during the Carboniferous period, 300 million years ago.
Most people find AI creepy and terrifying, but like the common spider, it also expresses a technical beauty. In particular, the two share a defining trait: the ability to weave webs. It is the trait we humans decided made spiders a distinct evolutionary order, Order Araneae, and one I just discovered has been successfully replicated by AI when it comes to writing software.
Testing has never been more critical as I work through my next app, a wet-bulb temperature calculator for the iPhone and iPad. My life will depend on this app as the planet warms and the area where I live, Vancouver, BC, is hit by deadly heat domes; 600 people died in our last one. This is serious shit.
I generated model, view-model, and UI test suites based on an early version of the app implementation. Right off the bat, the view-model test suite revealed a weird issue with Stull’s formula, which the AI had decided to use to calculate the wet-bulb temperature, around 25 °C. Over the course of a day of AI testing on THE most important part of the app, the AI discovered a minor but essential wrinkle in how water, at the quantum level, behaves in the range 24 °C to 26 °C. It improved Stull’s timeworn formula to fix its testing problem.
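For context, here is a minimal Swift sketch of Stull’s (2011) empirical approximation, the formula the AI picked. The function name is my own invention; it assumes temperature in degrees Celsius and relative humidity in percent:

```swift
import Foundation

/// Stull's (2011) empirical approximation of wet-bulb temperature.
/// - Parameters:
///   - t: dry-bulb (air) temperature in °C
///   - rh: relative humidity in percent (e.g. 50 for 50 %)
/// - Returns: approximate wet-bulb temperature in °C
func wetBulbStull(t: Double, rh: Double) -> Double {
    let a = t * atan(0.151977 * sqrt(rh + 8.313659))
    let b = atan(t + rh) - atan(rh - 1.676331)
    let c = 0.00391838 * pow(rh, 1.5) * atan(0.023101 * rh)
    return a + b + c - 4.686035
}
```

Punch in 25 °C at 50 % relative humidity and you get roughly 18 °C, right in the neighbourhood where the tests started acting up.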
*Mind* *Blown*
I was hooked, er, snared, and, unknowingly, had just fallen into the AI spider’s web – “The Testing Web Of Doom.” Echo. Echo.
Testing the Model and View layers was quick, easy, and satisfying. Nothing says ‘oh-yeah’ like a row of green checkmarks indicating that the tests had passed.
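A model-layer test at this stage really was as simple as it sounds. Something like this hypothetical XCTest case, building on the sketch above (names illustrative):

```swift
import XCTest

// Hypothetical unit test for the wet-bulb model layer.
final class WetBulbModelTests: XCTestCase {
    func testStullFormulaAtModerateConditions() {
        // 25 °C at 50 % RH should give roughly 18 °C wet-bulb.
        let tw = wetBulbStull(t: 25.0, rh: 50.0)
        XCTAssertEqual(tw, 18.0, accuracy: 0.1)
    }
}
```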
“Time to start testing the UI!” I excitedly thought.
Within a week, the single test script had morphed into a hive of behaviour- and UI-element-based subtests, relying on a base testing class that I must have spent 30 hours or more optimizing with the AI. As new problems appeared, the AI wrote more software that, in turn, failed, requiring more fragile software to fix. Ahheee!!! I was TRAPPED!!1!
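To give you a flavour, the base class started out looking something like this sketch; the class and helper names here are hypothetical stand-ins for the real thing, which grew far hairier:

```swift
import XCTest

// A stripped-down sketch of a shared UI-testing base class.
// Class and helper names are hypothetical.
class BaseUITestCase: XCTestCase {
    let app = XCUIApplication()

    override func setUpWithError() throws {
        // Stop at the first failure; flaky UI tests compound quickly.
        continueAfterFailure = false
        app.launch()
    }

    /// Wait for an element to appear instead of sleeping a fixed time.
    func assertAppears(_ element: XCUIElement, within timeout: TimeInterval = 5) {
        XCTAssertTrue(element.waitForExistence(timeout: timeout),
                      "Element \(element) never appeared")
    }
}
```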
I gritted my teeth each time the AI, after a heavy debug session filled with thick error logs, would excitedly announce, “Oh! I know exactly what the problem is. The timing on the main thread loop doesn’t balance properly with the bounce state of the fusion of text and button I created … blah blah blah.”
Two weeks later, I realized I was trapped in the AI’s testing web. Without considering the implications, I had asked it to test its own code with its own code. Code that was anything but perfect and about as fragile as a doomed snowflake in a fading shadow at sunrise.
I’ll skip the gory details, but the bottom line is that the interface layer, the UI, was running quite nicely in the simulator and on my phone. However, the AI could not write reliable test software for it, and I was trapped in its web of endless generative AI programming. The test suite, in fact, had grown, in lines of code, to twice the size of the app itself. I was the AI’s minion, not the other way around. Gurgle … gasp … cough …
Pretty funny in hindsight, and boy, did I learn a lot about the deeper levels of UI testing.
Pro tip: AI is not yet ready to write pro-grade UI testing software. Do not go there without a map, a flashlight, at least half a dozen power bars, and an exit strategy.