Big Technology Podcast: Claude Code’s Shining Moment, ChatGPT for Healthcare, End Of Busywork?
Episode webpage: https://www.bigtechnology.com/
The history of digital computation has been dominated by the paradigm of general-purpose processing. For nearly half a century, the Central Processing Unit (CPU) served as the universal engine of the information age. This universality was its greatest strength and, eventually, its critical weakness. As the mid-2010s approached, the tech industry faced a collision of two tectonic trends: the deceleration of Moore’s Law and the explosive growth of Deep Learning (DL).
This report provides an exhaustive analysis of Google’s response to this collision: the Tensor Processing Unit (TPU). It traces the lineage of the TPU from clandestine experiments on “coffee table” servers to the deployment of the Exascale-class TPU v7 “Ironwood.” It explores the architectural philosophy of Domain Specific Architectures (DSAs) championed by Turing Award winner David Patterson, arguing that the future of computing lies not in doing everything reasonably well, but in doing one thing—matrix multiplication—with absolute efficiency.
Through a detailed examination of seven generations of hardware, this report illustrates how the TPU enabled the modern AI revolution, powering everything from AlphaGo to the Gemini and PaLM models. It details technical specifications, the shift to liquid cooling, the introduction of optical interconnects, and the “Age of Inference” ushered in by Ironwood. The analysis suggests that the TPU is a vertically integrated supercomputing instrument that allowed Google to decouple its AI ambitions from the constraints of the merchant silicon market.
To understand the genesis of the TPU, one must understand the physical constraints facing machine learning pioneers in the early 2010s. Before the era of polished cloud infrastructure, the hardware reality for deep learning researchers was chaotic and improvised.
In 2012, Zak Stone—who would later found the Cloud TPU program—operated in a startup environment that necessitated extreme frugality. To acquire the necessary compute power for training early neural networks, Stone and his co-founders resorted to purchasing used gaming GPUs from online marketplaces. They assembled these disparate components into servers resting on their living room coffee table. The setup was so power-inefficient that turning on a household microwave would frequently trip the circuit breakers, plunging their makeshift data center into darkness.1
This “coffee table” era serves as a potent metaphor for the state of the industry. The hardware being used—GPUs designed for rendering video game textures—was accidentally good at the parallel math required for AI, but it was not optimized for it.
By 2013, deep learning was moving from academic curiosity to product necessity. Jeff Dean, Google’s Chief Scientist, performed a calculation that would become legendary in the annals of computer architecture. He estimated the computational load if Google’s user base of 100 million Android users utilized the voice-to-text feature for a mere three minutes per day. The results were stark: supporting this single feature would require doubling the number of data centers Google owned globally.1
This was an economic and logistical impossibility. Building a data center is a multi-year, multi-billion dollar capital expenditure. The projection revealed an existential threat: if AI was the future of Google services, the existing hardware trajectory (CPUs) and the alternative (GPUs) were insufficient.
Faced with this “compute cliff,” Google initiated a covert hardware project in 2013 to build a custom Application-Specific Integrated Circuit (ASIC) that could accelerate machine learning inference by an order of magnitude.4
Led by Norm Jouppi, a distinguished hardware architect known for his work on the MIPS processor, the team operated on a frantic 15-month timeline.4 They prioritized speed over perfection, shipping the first silicon to data centers without fixing known bugs, relying instead on software patches. The chip was packaged as an accelerator card that fit into the SATA hard drive slots of Google’s standard servers, allowing for rapid deployment without redesigning server racks.4
For nearly two years, these chips—the TPU v1—ran in secret, powering Google Search, Translate, and the AlphaGo system that defeated Lee Sedol in 2016, all while the outside world remained oblivious.3
The TPU v1 was a domain-specific accelerator designed strictly for inference.
The defining feature of the TPU is the Matrix Multiply Unit (MXU) based on a Systolic Array. Unlike a CPU, which constantly reads and writes to registers (the “fetch-decode-execute-writeback” cycle), a systolic array flows data through a grid of Multiplier-Accumulators (MACs) in a rhythmic pulse.
In the TPU v1, this array consisted of 256 x 256 MACs.6
The result: each operand is read from memory once and reused as it flows through the array, so nearly every clock cycle is spent on multiply-accumulate arithmetic rather than on fetching data.
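For illustration, here is a small C sketch of that dataflow: a functional model of a weight-stationary systolic multiply, where each grid cell conceptually holds one weight, activations stream across the rows, and partial sums accumulate down the columns. The toy dimensions and values are illustrative only; this models the idea, not the TPU's actual microarchitecture.

```c
#include <stdio.h>
#include <stdint.h>

/* Functional model of a weight-stationary systolic matrix multiply:
 * conceptually, each cell (k, n) of the grid permanently holds one weight
 * w[k][n], activations stream across the rows, and partial sums accumulate
 * down the columns, so each operand is fetched once and reused across the
 * array. Dimensions here are tiny; the TPU v1 grid is 256x256. */
#define M 2   /* activation rows streamed through the array */
#define K 3   /* contraction dimension (array height)       */
#define N 2   /* output columns (array width)               */

void systolic_matmul(int8_t a[M][K], int8_t w[K][N], int32_t c[M][N]) {
    for (int m = 0; m < M; m++) {          /* stream one activation row */
        for (int n = 0; n < N; n++) {      /* one column of MAC cells */
            int32_t acc = 0;               /* partial sum entering the column */
            for (int k = 0; k < K; k++)
                acc += (int32_t)a[m][k] * w[k][n];  /* MAC at cell (k, n) */
            c[m][n] = acc;                 /* result drains from the bottom */
        }
    }
}

int main(void) {
    int8_t a[M][K] = {{1, 2, 3}, {4, 5, 6}};
    int8_t w[K][N] = {{1, 0}, {0, 1}, {1, 1}};
    int32_t c[M][N];
    systolic_matmul(a, w, c);
    for (int m = 0; m < M; m++)
        printf("%d %d\n", (int)c[m][0], (int)c[m][1]);
    return 0;
}
```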
The TPU v1 aggressively used Quantization, operating on 8-bit integers (INT8) rather than the standard 32-bit floating-point numbers.6 This decision quadrupled memory bandwidth and significantly reduced energy consumption, as an 8-bit integer addition consumes roughly 13x less energy than a 16-bit floating-point addition.7
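As a rough illustration of what INT8 quantization looks like in software, here is a minimal C sketch using a simple symmetric max-abs scale. The values and scaling scheme are assumptions for the example, not Google's production quantization pipeline.

```c
#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Minimal symmetric INT8 quantization: derive one scale from the largest
 * magnitude in the tensor, then map every float onto the range [-127, 127].
 * Illustrative values and scheme only. */
static int8_t quantize(float x, float scale) {
    long q = lroundf(x / scale);
    if (q > 127)  q = 127;     /* clamp to the INT8 range */
    if (q < -127) q = -127;
    return (int8_t)q;
}

int main(void) {
    float weights[] = {0.82f, -1.93f, 0.004f, 1.21f};
    int n = sizeof weights / sizeof weights[0];

    float max_abs = 0.0f;
    for (int i = 0; i < n; i++)
        if (fabsf(weights[i]) > max_abs) max_abs = fabsf(weights[i]);
    float scale = max_abs / 127.0f;        /* one INT8 step in float units */

    for (int i = 0; i < n; i++) {
        int8_t q = quantize(weights[i], scale);
        printf("%+.3f -> %4d (dequantized %+.3f)\n", weights[i], q, q * scale);
    }
    return 0;
}
```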
When published in 2017, the specifications revealed a processor radically specialized compared to its contemporaries.
| Feature | TPU v1 | NVIDIA K80 GPU | Intel Haswell CPU |
|---|---|---|---|
| Primary Workload | Inference (INT8) | Training/Graphics (FP32) | General Purpose |
| Peak Performance | 92 TOPS (8-bit) | 2.8 TOPS (8-bit) | 1.3 TOPS (8-bit) |
| Power Consumption | ~40W (Busy) | ~300W | ~145W |
| Clock Speed | 700 MHz | ~560-875 MHz | 2.3 GHz+ |
| On-Chip Memory | 28 MiB (Unified Buffer) | Shared Cache | L1/L2/L3 Caches |
Data compiled from 4.
The TPU v1 achieved 92 TeraOps/second (TOPS) while consuming only 40 Watts, providing a 15x to 30x speedup in inference and a 30x to 80x improvement in energy efficiency (performance/Watt) compared to contemporary CPUs and GPUs.6
The technical success of the TPU v1 validated the theories of David Patterson, a Turing Award winner who joined Google as a Distinguished Engineer in 2016.8
Patterson argued that Moore’s Law (transistor density) and Dennard Scaling (power density) were failing. Consequently, the only path to significant performance gains—10x or 100x—was through Domain Specific Architectures (DSAs).10
The TPU is the archetypal DSA. By removing “general purpose” features like branch prediction and out-of-order execution, the TPU devotes nearly all its silicon to arithmetic. Patterson noted that for the TPU v1, the instruction set was CISC (Complex Instruction Set Computer), sending complex commands over the PCIe bus to avoid bottlenecking the host CPU.6
To free itself from NVIDIA GPUs, Google needed a chip capable of training, which requires higher precision (floating point) and backpropagation.
Introduced in 2017, the TPU v2 was no longer a lone inference coprocessor but a supercomputing node: it added floating-point matrix units for training, replaced DDR3 with High Bandwidth Memory (HBM), and linked chips over a dedicated high-speed interconnect so that 256 chips could be assembled into a "Pod" delivering 11.5 PetaFLOPS.
Google researchers invented the bfloat16 (Brain Floating Point) format for TPU v2. By truncating the mantissa of a 32-bit float but keeping the 8-bit exponent, they achieved the numerical stability of FP32 with the speed and memory density of FP16.14 This format has since become an industry standard.
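Because bfloat16 is simply the top 16 bits of an IEEE-754 float (sign, 8-bit exponent, 7 mantissa bits), the conversion can be sketched in a few lines of C. This sketch truncates for simplicity; real hardware typically rounds to nearest even, and the example value is arbitrary.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* bfloat16 keeps FP32's sign bit and 8-bit exponent but only 7 mantissa bits,
 * i.e. the top 16 bits of an IEEE-754 float. This sketch truncates; real
 * hardware typically rounds to nearest even. */
static uint16_t float_to_bfloat16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);     /* reinterpret the float's bit pattern */
    return (uint16_t)(bits >> 16);      /* keep sign + exponent + top mantissa bits */
}

static float bfloat16_to_float(uint16_t b) {
    uint32_t bits = (uint32_t)b << 16;  /* dropped mantissa bits become zero */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    float x = 3.14159265f;
    uint16_t b = float_to_bfloat16(x);
    printf("float: %.7f  bfloat16 bits: 0x%04X  round-trip: %.7f\n",
           x, (unsigned)b, bfloat16_to_float(b));
    return 0;
}
```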
The TPU v3 pushed peak compute to 123 TFLOPS per chip.15 However, the power density was too high for air cooling. Google introduced liquid cooling directly to the chip, allowing v3 Pods to scale to 1,024 chips and deliver over 100 PetaFLOPS.16
For the Large Language Model (LLM) era, Google needed exascale capabilities.
TPU v4 introduced Optical Circuit Switches (OCS). Instead of electrical switching, OCS uses MEMS mirrors to reflect light beams, reconfiguring the network topology on the fly (e.g., from 3D Mesh to Twisted Torus).18 This allowed v4 Pods to scale to 4,096 chips and 1.1 exaflops of peak compute.18
To accelerate recommendation models (DLRMs), which rely on massive embedding tables, TPU v4 introduced the SparseCore. These dataflow processors accelerated embeddings by 5x-7x while occupying only 5% of the die area.19
The v4 Pods were used to train PaLM (Pathways Language Model) across two Pods simultaneously, achieving a hardware utilization efficiency of 57.8%.20
In 2023, Google bifurcated the TPU line to address different market economics: the cost-optimized TPU v5e for mainstream training and serving, and the performance-focused TPU v5p for frontier LLM training.
Announced in May 2024, Trillium (TPU v6e) focused on the "Memory Wall," doubling HBM capacity and bandwidth relative to the v5e while sharply raising per-chip compute.
In April 2025, Google unveiled its most ambitious silicon to date: TPU v7, codenamed Ironwood. While previous generations chased training performance, Ironwood was explicitly architected for the “Age of Inference” and agentic workflows.
Ironwood represents a massive leap in raw throughput and memory density, designed to hold massive “Chain of Thought” reasoning models in memory.
A single Ironwood pod can scale to 9,216 chips.14 Google claims Ironwood delivers 2x the performance per watt compared to the already efficient Trillium (v6e). This efficiency is critical as data centers face power constraints; Ironwood allows Google to deploy agentic models that “think” for seconds or minutes (inference-heavy workloads) without blowing out power budgets.
David Patterson and Jeff Dean championed the metric of Compute Carbon Intensity (CCI).22 Their research highlights that the vertical integration of the TPU—including liquid cooling and OCS—reduces the carbon footprint of AI. TPU v4, for instance, reduced CO2e emissions by roughly 20x compared to contemporary DSAs in typical data centers.20
| Feature | TPU v1 (2015) | TPU v4 (2020) | TPU v5p (2023) | TPU v6e (2024) | TPU v7 (2025) |
|---|---|---|---|---|---|
| Codename | – | Pufferfish | – | Trillium | Ironwood |
| Use Case | Inference | Training | LLM Training | Efficient Training | Agentic Inference |
| Peak Compute | 92 TOPS (INT8) | 275 TFLOPS (BF16) | 459 TFLOPS (BF16) | 918 TFLOPS (BF16) | ~2,300 TFLOPS (BF16) |
| HBM | – | 32 GB | 95 GB | 32 GB | 192 GB |
| Bandwidth | 34 GB/s | 1,200 GB/s | 2,765 GB/s | 1,600 GB/s | 7,400 GB/s |
| Pod Size | N/A | 4,096 | 8,960 | 256 | 9,216 |
Table compiled from 6, 15, 21.
The TPU is not merely a chip; it is a “Silicon Sovereign.” From the coffee table to the Ironwood pod, Google has successfully decoupled its AI destiny from the merchant market, building a machine that spans from the atom to the algorithm.
I have worked in high tech for 30 years, building products for public companies and training thousands of software engineers to be productive.
In 202, I have tutored 331 hours, working with middle school, high school, and college students in computer science.
My most recent students are adults who are not computer engineers but want to use tools such as Cursor, Windsurf, and Claude Code to build websites and apps, because they have an idea and they have heard that vibe coding is possible for them.
One such example is someone who wanted to build a desktop app that takes an image and does automatic dimensioning. Within an hour I helped him install and set up Claude Code. We launched Claude Code inside the terminal, and I guided him to first create a product requirements document (PRD) for the product he wanted.
Then I asked him to prompt the AI agent to create an engineering implementation plan. Given these two documents, the AI has all the context it needs.
The next prompt simply uses '@' to mention the implementation.md file, and after just two prompts following @implementation.md, my student had a fully working Python GUI application that does automated edge detection and creates a measurement overlay.
The level of quality after only two prompts with Claude Code surprised even me, and I have been using Claude Code daily for over five months.
If you are someone who is thinking about building a custom app for yourself, here are my suggestions:
1) Use Claude Code; $20/month is well worth the cost.
2) Build something that you will use every day, otherwise you will not have the motivation to continue.
3) Be ready to spend at least 40 hours prompting and learning about the capabilities of AI coding and, more importantly, its limitations.
4) My rule of thumb: it will take you about 40 hours to build a small app and 80 hours to build a medium-sized app.
Keep building!
Today at 3:30pm, I fired off this prompt to Claude Code For Web, fingers crossed.
As a senior engineer, orchestrate the full implementation of the shared pantry feature docs/features/HOUSEHOLD_SHARING_IMPLEMENTATION.md

On this birthday, I gift myself the space, time, and resources to learn. Recognizing the uncomfortable knot in my stomach when I'm stretching myself to learn something new. The fear of failing in front of someone while I'm pushing my limits. Then comes the breakthrough when I've finally mastered a new skill: that feeling of growth, of reaching a new step and looking out to see what's beyond from a new vantage point.
In the everyday moments, I would like to remind my future self that there is something to be learned. When talking with an artist at the Holualoa Kona Coffee Festival and learning how she works with block prints to get multiple colors printed, I'm in awe of something new I learned.
When I tutor a college student who is learning C, I learn how they misunderstand pointers that come so naturally to me after looking at C code for 30 years. (She had been reading a declaration like char *name; as if the '*' belonged with the type, but seeing that the '*' is actually associated with the declarator '*name' completely changed her understanding.)
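Here is a tiny C illustration of that pointer-declaration point (the variable names are made up for the example):

```c
#include <stdio.h>

int main(void) {
    char greeting[] = "aloha";
    /* The '*' binds to the declarator, not to the type: this one line
     * declares p as a pointer to char and c as a plain char. */
    char *p = greeting, c = 'A';
    printf("%s %c\n", p, c);
    return 0;
}
```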
I learned a little bit about growing cocoa by looking at our own trees and the little cocoa babies that are able to grow from the hundreds of flowers.
Here is my birthday week celebration!
I have been using AI coding tools daily to prototype and launch production-level apps and websites. I have also been tutoring non-computer-science folks through my Wyzant tutoring business on how to use AI to produce the app that they have always wished existed in the world. One high school student is building a community golf app in New York. Another student is building a mental health app that is guided by a cute animated dragon.
I have lived through the dot-com bubble in the 2000s, and YES this generative AI period does feel like a bubble.
Ben Thompson wrote a fantastic piece on The Benefits of Bubbles:
if AI does nothing more than spur the creation of massive amounts of new power generation it will have done tremendous good for humanity. … To that end, I’m more optimistic today than I was even a week ago about the AI bubble achieving Perez-style benefits: power generation is exactly the sort of long-term payoff that might only be achievable through the mania and eventual pain of a bubble, and the sooner we start feeling the financial pressure — and the excitement of the opportunity — to build more power, the better.
Today I was working with a college student who was struggling to read a project spec or problem set and know where to start tackling the problem. So what I coach them on doing is, rather than focusing on the larger problem set, breaking the problem down into more concrete examples.
In this particular assignment, his task is to read a binary file that has integers stored in big-endian format, print out the numbers in hexadecimal and decimal, then break each integer down into its 4 bytes and print each byte either as a printable character or as its value in hexadecimal.
As a tutor, my value-add is being able to tailor a lesson to exactly where the student is at. Not too easy, and especially not too hard.
He seemed to understand some of the basics: numbers, hexadecimal, and C syntax.
I started by just having him hardcode a number based on the example from the assignment. Then I asked him to use printf("0x%08X", num); to print the number in hex and printf("%u", num); to print it in decimal.
Then I had him create another unsigned int to represent byte1, hardcode it, and print it. Then byte2, hardcoded the same way. He was comfortable here.
I asked him to print byte1 as a character if isprint() says it is printable, and to print the decimal value if not.
While this was all hardcoded, I told him it would build up to something more later.
Next was to read all the integers from the file, using a while loop and fread().
Once the read loop works, he can replace the hardcoded num with what is read from the file in the loop.
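A minimal sketch of that read loop, assuming a placeholder file name ("data.bin") and 4-byte unsigned integers as described in the assignment:

```c
#include <stdio.h>

/* Sketch of the read loop: fread() pulls one 4-byte integer at a time until
 * the file runs out, and the loop body reuses the printing that was first
 * written against a hardcoded num. ("data.bin" is a placeholder name.) */
int main(void) {
    FILE *fp = fopen("data.bin", "rb");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    unsigned int num;
    while (fread(&num, sizeof num, 1, fp) == 1) {
        printf("0x%08X %u\n", num, num);   /* hex and decimal, as before */
    }
    fclose(fp);
    return 0;
}
```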
He can see how the solution is built from the bottom up.
Next up: extracting the 4 bytes from the integer. I asked him if he had learned bitwise operators. He immediately understood that the '&' operator should be used, but wasn't sure whether to use 0xF or 0xFF. I asked him how many bits are in 0xF; he wasn't really sure. Instead of pushing him to think hard about the number of bits, I relented and asked him, given an example integer, 0xA3, which mask we should use, just by looking at it visually.
He was a little confused about whether to shift right or left, so I had him write the number down in a concrete format:
0x34 A2 BB F1, and I asked him to visually move these bytes to the right like a ticker tape, which becomes 0x00 34 A2 BB, and one more time becomes 0x00 00 34 A2, and so on. He immediately figured out the following:
byte1 = num & 0xff;
byte2 = (num >> 8) & 0xff;
byte3 = (num >> 16) & 0xff;
byte4 = (num >> 24) & 0xff;
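Putting the shift-and-mask step together with the printable-character check, here is a compact sketch built on the hardcoded example from the lesson; the exact output format is my own choice, not the assignment's spec.

```c
#include <stdio.h>
#include <ctype.h>

/* The shift-and-mask pattern applied to the concrete example from the lesson:
 * peel off one byte at a time from 0x34A2BBF1, then print each byte as a
 * character if it is printable and as hex otherwise. In the finished
 * assignment, num would come from the fread() loop instead of being hardcoded. */
int main(void) {
    unsigned int num = 0x34A2BBF1;

    unsigned int byte1 = num & 0xff;          /* 0xF1 */
    unsigned int byte2 = (num >> 8) & 0xff;   /* 0xBB */
    unsigned int byte3 = (num >> 16) & 0xff;  /* 0xA2 */
    unsigned int byte4 = (num >> 24) & 0xff;  /* 0x34 */

    unsigned int bytes[4] = { byte4, byte3, byte2, byte1 };  /* most significant first */
    for (int i = 0; i < 4; i++) {
        if (isprint((int)bytes[i]))
            printf("'%c' ", (char)bytes[i]);
        else
            printf("0x%02X ", bytes[i]);
    }
    printf("\n");
    return 0;
}
```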
Today, as a tutor, I hope I have gotten better at helping a student come up with a solution to a complex problem by using concrete examples instead of thinking in the abstract.
I’ve spent a significant portion of my life dedicated to the sport of badminton. The fast-paced, precision-driven nature of the game has always appealed to the software engineer in me. The thrill of a perfectly executed drop shot, the satisfaction of a powerful smash, and the camaraderie of the badminton community have been a constant source of joy and challenge. For years, I trained, competed, and immersed myself in the world of shuttlecocks and high-tension rackets. It was a world I knew and loved, a world where I felt a sense of belonging.
But life, as it often does, has a way of presenting new paths and unexpected opportunities. After many years of living in the Bay Area, I found myself in the beautiful islands of Hawaii. The change of scenery was a welcome one, a chance to embrace a different pace of life, a new culture, and a new set of experiences. And with this change came a surprising shift in my athletic pursuits. I found myself drifting away from the familiar courts of badminton and drawn to the wide-open spaces of the tennis court.
It wasn’t a conscious decision at first, but rather a gradual realization. In Hawaii, tennis courts are everywhere. They’re in public parks, at community centers, and nestled in residential neighborhoods. The accessibility is astounding. I could walk out of my door and be on a court within minutes, a stark contrast to the often-crowded and hard-to-book badminton gyms I was used to. The sheer convenience of it all was a game-changer. No more scrambling for court time, no more waiting for a spot to open up. The courts were there, waiting to be played on, bathed in the warm Hawaiian sun.
And then there was the community. The tennis community in Hawaii is incredibly welcoming. I was a newcomer, a badminton player with a decent racket-sport foundation but a complete novice when it comes to the nuances of tennis. I was worried I wouldn’t fit in, that I’d be out of my depth. But my fears were quickly put to rest. I was met with open arms, with friendly faces eager to share their love of the game. I found hitting partners of all skill levels, from seasoned veterans to fellow beginners. There was a sense of shared passion, a collective desire to improve and enjoy the sport together. It reminded me of the camaraderie I cherished in the badminton world, but with a uniquely Hawaiian flavor of aloha.
Playing tennis outdoors in Hawaii is an experience that’s hard to put into words. It’s more than just a game; it’s a connection to nature. The feeling of the sun on your skin, the gentle trade winds rustling through the palm trees, the vibrant colors of the tropical landscape surrounding you – it’s a sensory feast. It’s a far cry from the enclosed, often sterile environment of a badminton gym. There’s a sense of freedom, of being a part of something bigger than yourself. It’s a feeling that resonates deeply with the runner in me, the part of me that finds solace and clarity in the great outdoors. It’s a feeling I’ve chased on the running trails of Castro and the streets of Paris, and now I’ve found it on the tennis courts of Hawaii.
The transition from badminton to tennis hasn’t been without its challenges. The muscle memory I’ve built up over years of playing badminton has been both a blessing and a curse. The hand-eye coordination is there, the footwork is decent, but the swing, the timing, the strategy – it’s a whole new ball game. It’s a mental and physical puzzle that I’m slowly but surely piecing together. It’s a new challenge, a new mountain to climb, and I’m embracing it with the same determination and discipline that I’ve applied to my running and my crazy push-up goals.
I’ve found that the lessons I’ve learned from my other passions are surprisingly applicable to tennis. The endurance I’ve built from running marathons helps me stay strong through long matches. The mental fortitude I’ve developed from pushing my physical limits helps me stay focused and composed under pressure. And the introspective nature of my personality, the part of me that’s always analyzing and seeking to understand, helps me break down the complexities of the game and find ways to improve.
I still have a deep love and respect for badminton. It’s a sport that has given me so much, and I’ll always cherish the memories and friendships I’ve made along the way. But for now, my heart is on the tennis court. It’s a new chapter in my athletic journey, a new adventure that I’m excited to embark on. It’s a chance to learn, to grow, and to connect with a new community in a new and beautiful place. And as I stand on the court, racket in hand, with the Hawaiian sun on my face and the sound of the ocean in the distance, I can’t help but feel a sense of gratitude and excitement for what’s to come.