Wednesday Evening Edition

35+

Tweets

Topics

16h

Window

Google Stitch Arrives

1 item

Google launched Stitch, an AI-native design platform with five major upgrades that transform it from a prompt-to-UI generator into a spatial, multimodal, autonomous design environment — and it hit 2.4M views in hours.

The upgrades include AI-Native Canvas, Design Systems integration, and DESIGN.md support. The product walkthrough video is rolling out now. At 10K likes and 1.5K retweets, this is one of Google's most engaged product launches in months — notable because it targets the design-to-code pipeline that Figma, Vercel's v0, and a dozen startups have been fighting over. Google entering with a full design-system-aware tool raises the bar significantly.

@stitchbygoogle · 5h, 2.4M views, 10K likes — the engagement dwarfs most AI tool launches, suggesting designers have been waiting for this

The Self-Improving Machine

4 items

MiniMax released M2.7, describing "Early Echoes of Self-Evolution" — a model whose RL training loop literally runs and iterates experiments on its own, with minimal human intervention.

The blog post and accompanying workflow diagram show a five-stage pipeline: Experiment Plan → Experiment Dev & Run → Analyze & Report → Review & Discuss → Iterate Loop. The "auto-continue" arrow between stages is the key detail — the agent doesn't stop between iterations. Ara called it "borderline singularity level ML achieved in open-source," noting that the model writes code to upgrade itself, fixes live server crashes in under 3 minutes, and wins gold medals in ML competitions.

@MiniMax_AI · 4h, @arafatkatze · 4h, @wildmindai · 3h — combined 27K+ views

ClawTeam demonstrated fully autonomous ML training: a single prompt spawns 8 agents that self-coordinate across 8 H100s, run 2,430+ experiments, and achieve a 6.4% performance boost with zero human intervention.

Chao Huang's post describes what is essentially AI iterating on AI — the agents manage GPU allocation, experiment design, hyperparameter search, and evaluation autonomously. The upstream enabler is the same agent-orchestration infrastructure (OpenClaw, Codex subagents) that shipped last week, now applied to the training loop itself rather than just coding tasks.

@huang_chao4969 · 8h

A Zhejiang University team built SkillNet — an open infrastructure with 200,000 reusable AI skills designed to solve the "every agent forgets everything" problem.

The paper addresses a fundamental limitation: current agents operate in complete isolation, reinventing the wheel on every task. SkillNet creates a repository where agents can discover, compose, and chain pre-built skills with relational connections and multi-dimensional evaluation. The team reports it improves average rewards by 47% and reduces execution steps by 30% across multiple backbone models.

@godofprompt · 13h

A researcher distilled Claude Opus 4.6's reasoning into an open-source Qwen 3.5 27B model — and it immediately attracted 858 likes on HuggingFace.

The model, "Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled," was trained on filtered reasoning traces from Opus and is Apache 2.0 licensed. The demand for frontier-model reasoning in smaller, runnable packages is clearly intense — this is the same dynamic that drove GPT-5.4 mini/nano yesterday, but from the open-source community working backward from Claude's outputs.

@Jackrong on HuggingFace · 6 replies, 45 retweets, 305 likes, 19K views

Claude Ecosystem — Day Two

3 items

Garry Tan launched /office-hours, a Claude skill that simulates YC's startup evaluation thinking — and Jed White called it "essentially YC open-sourcing some of the thinking they use to help startups grappling with new ideas."

The skill runs on GStack and includes a companion /debug skill that follows what Tan calls the Iron Law: no fixes without root cause investigation first. It traces data flow, matches against known bug patterns, and if three fixes fail, stops and questions the architecture. The meta-story here is the YC president personally building Claude skills, signaling that the skills ecosystem has reached institutional adoption.

@garrytan · 7h, @jedwhite · 5h, 18K views

Suryansh Tiwari's breakdown of Anthropic's internal Claude Code skills playbook hit 16h of sustained engagement, with a detailed terminal screenshot showing the full methodology: Plan Mode Default, Subagent Strategy, Self-Improvement Loop, Verification Before Done, Demand Elegance, and Skills as Context Engine.

The key insight from the playbook: skills are not text files — they are modular systems the agent can explore and execute, including code, scripts, data, and workflows. The "Avoid Over-Constraining AI" section advises providing context rather than micromanagement, and letting AI adapt to problems with flexibility over strict instructions.

@Suryanshti777 · 16h

Charly Wargnier framed Anthropic's Dispatch as a direct competitor to OpenClaw, describing it as "a new research preview in Claude Cowork that completely changes how you interact with AI" — pair your phone to a persistent desktop session and message tasks on the go.

The Dispatch-vs-OpenClaw framing is interesting because it reveals how quickly the market is converging on "persistent agent sessions" as the next interface paradigm, with both Anthropic and OpenAI shipping variants within days of each other.

@DataChaz · 2h, 9.2K views

Agent Reliability

2 items

Clare Liguori tested 5 approaches to guiding AI agent behavior across 3,000 eval runs — and Strands steering hooks was the only method that achieved 100% accuracy.

The results are striking: steering hooks hit 100% across 600 eval runs, compared to 82.5% for simple prompt-based instructions and 80.8% for graph-based workflows. The key mechanism is "just-in-time guidance" — injecting instructions to the model immediately before tool calls rather than upfront in the system prompt. The paper from strandsagents.com suggests that the problem with most agent frameworks isn't the model but the timing of when guidance is delivered.

@clare_liguori · 3h, 2.7K views

Chinese researchers proposed "Attention Residuals" to fix what they call a 10-year-old flaw in every major language model — the residual connection.

The Kivi Team paper argues that standard residual connections blindly stack layer outputs, causing hidden-state noise to accumulate with depth. Their approach, Attention Residuals (AttRes), replaces blind accumulation with softmax attention over preceding layer outputs, allowing selective aggregation. They report preserving most gains of full Attention while adding minimal overhead, and scale validation up to 6.5 billion parameters.

@simplifyinAI · 12h, 35K views

AI, Work & Society

3 items

Andrew Yang declared "The Fuckening of white-collar workers has arrived" — and 700K people read it.

His blog post landed at 4.1K likes and 673 retweets, making it one of the week's most-engaged takes on AI labor displacement. The bluntness of the framing stands out against the usual measured policy language — Yang has been warning about this since his 2020 presidential campaign, but the tone shift from "this is coming" to "this has arrived" reflects the acceleration in real-world job impacts that stories like the Bengaluru IT crisis are making undeniable.

@AndrewYang · Mar 17, 700K views — the engagement pattern shows the anxiety is no longer theoretical

CrowdReply crossed $5M ARR fully bootstrapped with a product that embeds business names into AI-training-data conversations — and a viral thread framing it as the birth of "Generative Engine Optimization" hit 802K views.

The business model is elegantly cynical: DM local businesses saying they're invisible to AI search, use CrowdReply to seed their name into online conversations that AI models scrape, and those businesses surface in AI-generated answers. It's SEO for the age of LLMs, and the 802K views suggest the market recognizes this as the next gold rush.

@MrBallaz · Mar 17, 802K views, 3.8K likes · @Crowdreply_io · $5M ARR

Databricks CEO Ali Ghodsi argued that Zoom has a massive, underexploited chance to become an AI-first enterprise product — because it sits on the largest dataset of meeting videos and transcripts in the world.

The insight, shared by Rohan Paul, is that the big pain in enterprise isn't generating content but extracting actionable intelligence from the unstructured conversations companies already have. Zoom's data moat — millions of hours of real business conversations — may be more valuable than its video-calling product.

@rohanpaul_ai · 1h, 2.9K views

Quiet Gems

4 items

OpenAI's Noam Brown launched "Parameter Golf" — a challenge game at openai.com/parameter-golf that tests how efficiently you can solve problems with minimal model parameters.

It's a clever piece of developer marketing that doubles as a genuine intellectual puzzle about model efficiency. The framing as a "golf" game — fewer parameters is better — neatly inverts the industry's obsession with scale.

@polynoamial · 17m, 3.8K views

Yann LeCun reposted Jon Barron's "gaussid" — a markdown conversion of Sam Roweis's classic gaussid.pdf, preserving a foundational machine learning document in a more accessible format on GitHub.

A small act of digital preservation for a paper that shaped how a generation of researchers thought about Gaussian processes.

@jon_barron · 5h, 7.5K views

Parimal drew a connection between ancient Indian knowledge systems — Samkhya, Yoga Vasistha, Kashmir Shaivism — and modern neuromorphic logic, arguing these frameworks offer a hardly-known architecture that mirrors how biological neural networks actually compute.

Filed under the feed's contemplative thread, which runs quieter than the AI product cycle but often surfaces the most interesting ideas.

@Fintech03 · 37m, 785 views

Astrogravity launched a free Vedic astrology platform generating complete birth charts (Rasi/D1 to D60), 120-year Vimshottari Dasha timelines, and multiple Ayanamsha calculations — no signup, no login.

A niche launch, but a polished one: the tool supports Lahiri, Raman, and KP ayanamshas and runs Panchanga, Muhurtha, and Tara Bala calculations. For anyone on this feed who follows Ravi's astrology thread alongside the AI signal, this is worth bookmarking.

@Astrogravity_in · 5h, 742 views

This Wednesday evening, the feed has shifted decisively from "what new models can do" to "what systems can do with models." SkillNet's 200K reusable skills, ClawTeam's self-coordinating GPU agents, Strands' steering hooks, Garry Tan's YC office-hours skill — every major story today is about orchestration, memory, and reliability rather than raw capability. The frontier moved from the model to the scaffold around it. And MiniMax's "self-evolution" paper suggests the scaffold itself is starting to evolve autonomously. Google's Stitch entry into the design-to-code space signals the same pattern in frontend tooling: the competition isn't about API quality anymore, it's about integrating design systems and autonomous iteration. Meanwhile, Andrew Yang's "Fuckening" landed with 700K views, and CrowdReply's $5M ARR on generative engine optimization signals that the market is reshaping faster than policy can adapt — the anxiety is no longer theoretical.