
The concept of AI self-improvement has been a hot topic in recent research circles, with a flurry of papers emerging and prominent figures like OpenAI CEO Sam Altman weighing in on the future of self-evolving intelligent systems.
Now, a new paper from MIT, titled "Self-Adapting Language Models," introduces SEAL (Self-Adapting LLMs), a novel framework that allows large language models (LLMs) to update their own weights.
This development is seen as another significant step toward truly self-evolving AI.

The research paper, published yesterday, has already ignited considerable discussion, including on Hacker News.
SEAL proposes a method where an LLM can generate its own training data through self-editing and subsequently update its weights based on new inputs.
Crucially, this self-editing process is learned via reinforcement learning, with the reward mechanism tied to the updated model's downstream performance.

The timing of this paper is particularly notable given the recent surge in interest surrounding AI self-evolution.
Earlier this month, several other research efforts garnered attention, including Sakana AI and the University of British Columbia's Darwin-Gödel Machine (DGM), CMU's Self-Rewarding Training (SRT), Shanghai Jiao Tong University's MM-UPT framework for continuous self-improvement in multimodal large models, and the UI-Genie self-improvement framework from The Chinese University of Hong Kong in collaboration with vivo.

Adding to the buzz, OpenAI CEO Sam Altman recently shared his vision of a future with self-improving AI and robots in his blog post, "The Gentle Singularity."
He posited that while the initial millions of humanoid robots would need traditional manufacturing, they would then be able to operate the entire supply chain to build more robots, which can in turn build more chip fabrication facilities, data centers, and so on.
This was quickly followed by a tweet from @VraserX claiming that an OpenAI insider had revealed the company was already running recursively self-improving AI internally, a claim that sparked widespread debate about its veracity.

Regardless of the specifics of internal OpenAI developments, the MIT paper on SEAL provides concrete evidence of AI's progression toward self-evolution.

Understanding SEAL: Self-Adapting Language Models

The core idea behind SEAL is to enable language models to improve themselves when encountering new data by generating their own synthetic data and optimizing their parameters through self-editing.
The model's training objective is to directly generate these self-edits (SEs) using data provided within the model's context.

The generation of these self-edits is learned through reinforcement learning.
The model is rewarded when the generated self-edits, once applied, lead to improved performance on the target task.
Therefore, SEAL can be conceptualized as an algorithm with two nested loops: an outer reinforcement learning (RL) loop that optimizes the generation of self-edits, and an inner update loop that uses the generated self-edits to update the model via gradient descent.

This method can be viewed as an instance of meta-learning, where the model learns how to generate effective self-edits.

A General Framework

SEAL operates on a single task instance (C, τ), where C is context information relevant to the task, and τ defines the downstream evaluation for assessing the model's adaptation.
For example, in a knowledge integration task, C might be a passage to be integrated into the model's internal knowledge, and τ a set of questions about that passage.

Given C, the model generates a self-edit SE, which then updates its parameters through supervised fine-tuning: θ′ ← SFT(θ, SE).
Reinforcement learning is used to optimize this self-edit generation: the model performs an action (generates SE), receives a reward r based on the performance of the updated model LM_θ′ on τ, and updates its policy to maximize the expected reward.

The researchers found that traditional on-policy methods like GRPO and PPO led to unstable training.
They ultimately opted for ReST^EM, a simpler, filtering-based behavioral cloning approach from a DeepMind paper.
This method can be viewed as an Expectation-Maximization (EM) process, where the E-step samples candidate outputs from the current model policy, and the M-step reinforces, through supervised fine-tuning, only those samples that yield a positive reward.

The paper also notes that while the current implementation uses a single model to generate and learn from self-edits, these roles could be separated in a teacher-student setup.

Instantiating SEAL in Specific Domains

The MIT team instantiated SEAL in two specific domains: knowledge integration and few-shot learning.

Knowledge Integration: The goal here is to effectively integrate information from articles into the model's weights.

Few-Shot Learning: This involves the model adapting to new tasks with very few examples.

Experimental Results

The experimental results for both few-shot learning and knowledge integration demonstrate the effectiveness of the SEAL framework.

In few-shot learning, using a Llama-3.2-1B-Instruct model, SEAL significantly improved adaptation success rates, achieving 72.5% compared to 20% for models using basic self-edits without RL training, and 0% without adaptation.
While still below Oracle TTT (an idealized baseline), this indicates substantial progress.

For knowledge integration, using a larger Qwen2.5-7B model to integrate new facts from SQuAD articles, SEAL consistently outperformed baseline methods.
Training with synthetic data generated by the base Qwen2.5-7B model already showed notable improvements, and subsequent reinforcement learning further boosted performance.
Accuracy also improved rapidly over outer-loop RL iterations, often surpassing setups using GPT-4.1-generated data within just two iterations.

Qualitative examples from the paper illustrate how reinforcement learning leads to the generation of more detailed self-edits, resulting in improved performance.

While the results are promising, the researchers also acknowledge some limitations of the SEAL framework, including aspects related to catastrophic forgetting, computational overhead, and context-dependent evaluation.
These are discussed in detail in the original paper.

Original Paper: https://arxiv.org/pdf/2506.10943
Project Site: https://jyopari.github.io/posts/seal
GitHub Repo: https://github.com/Continual-Intelligence/SEAL
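
For readers who want a concrete picture of the two nested loops and the filtering-based update described above, here is a minimal toy sketch in Python. It is not the authors' implementation: the "model" is just a set of known facts, the three self-edit styles in STYLES are hypothetical, and the M-step refits a simple probability table instead of fine-tuning an LLM. All function and variable names are assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical self-edit "styles": each maps a context C to synthetic training data.
STYLES = {
    "copy_passage": lambda ctx: [ctx["passage"]],                 # restate the passage
    "list_implications": lambda ctx: list(ctx["implications"]),   # atomic facts
    "empty": lambda ctx: [],                                      # nothing useful
}

def inner_update(known_facts, self_edit):
    """Inner loop stand-in for theta' <- SFT(theta, SE): absorb the self-edit."""
    return frozenset(known_facts) | frozenset(self_edit)

def evaluate(known_facts, questions):
    """Toy downstream evaluation tau: fraction of answer facts the model knows."""
    return sum(q in known_facts for q in questions) / len(questions)

def rest_em_round(policy, tasks, base_facts, samples_per_task, rng):
    """One outer-loop round of filtering-based behavioral cloning (toy version)."""
    styles = list(policy)
    weights = [policy[s] for s in styles]
    kept = Counter()
    for ctx, questions in tasks:
        baseline = evaluate(base_facts, questions)
        for _ in range(samples_per_task):
            # E-step: sample a candidate self-edit from the current policy.
            style = rng.choices(styles, weights=weights)[0]
            adapted = inner_update(base_facts, STYLES[style](ctx))
            # Reward is positive only if the adapted model beats the baseline.
            if evaluate(adapted, questions) > baseline:
                kept[style] += 1
    if not kept:
        return dict(policy)  # no rewarded samples: leave the policy unchanged
    # M-step: refit the policy on rewarded samples only.
    total = sum(kept.values())
    return {s: kept[s] / total for s in styles}

# Two made-up task instances (C, tau); the passages and facts are placeholders.
tasks = [
    ({"passage": "Passage about topic one.", "implications": ["fact A", "fact B"]},
     ["fact A", "fact B"]),
    ({"passage": "Passage about topic two.", "implications": ["fact C"]},
     ["fact C"]),
]
policy = {style: 1 / len(STYLES) for style in STYLES}
rng = random.Random(0)
new_policy = rest_em_round(policy, tasks, frozenset(), samples_per_task=50, rng=rng)
print(new_policy)
```

In this toy setting only the style that produces question-relevant facts ever earns a reward, so the filtered update shifts the policy toward it. In the real system, the inner update and the M-step would each be a fine-tuning run on an actual language model.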




