Payout Date: April 1, 2024
Total grants: USD 5,363,105
Number of grantees: 141
Discussion: EA Forum comments
Introduction
This payout report covers the Long-Term Future Fund's grantmaking from May 1 2023 to March 31 2024 (11 months). It follows our previous April 2023 payout report.
- Total funding recommended: $6,290,550
- Total funding paid out: $5,363,105
- Number of grants paid out: 141
- Acceptance rate (excluding desk rejections): 159/672 = 23.7%
- Acceptance rate (including desk rejections): 159/825 = 19.3%
- Report authors: Linchuan Zhang (primary author), Caleb Parikh (fund chair), Oliver Habryka, Lawrence Chan, Clara Collier, Daniel Eth, Lauro Langosco, Thomas Larsen, Eli Lifland
25 of our grantees, who received a total of $790,251, requested that our public reports for their grants be anonymized (the table below includes those grants). 13 grantees, who received a total of $529,819, requested that we not include public reports for their grants. You can read our policy on public reporting here.
We referred at least 2 grants to other funders for evaluation.
Highlighted Grants
(The following grant writeups were written by me, Linch Zhang. They were reviewed by the primary investigators of each grant.)
Below, we highlighted some grants that we thought were interesting and covered a relatively wide scope of LTFF’s activities. We hope that reading the highlighted grants can help donors make more informed decisions about whether to donate to LTFF.[1]
Gabriel Mukobi ($40,680) - 9-month university tuition support for technical AI safety research focused on empowering AI governance interventions
The Long-Term Future Fund provided a $40,680 grant to Gabriel Mukobi from September 2023 to June 2024, originally for 9 months of university tuition support. The grant enabled Gabe to pursue his master's program in Computer Science at Stanford, with a focus on technical AI governance.
Several factors favored funding Gabe, including his strong academic background (4.0 GPA in Stanford CS undergrad with 6 graduate-level courses), experience in difficult technical AI alignment internships (e.g., at the Krueger lab), and leadership skills demonstrated by starting and leading the Stanford AI alignment group. However, some fund managers were skeptical about the specific proposed technical research directions, although this was not considered critical for a skill-building and career-development grant. The fund managers also had some uncertainty about the overall value of funding Master's degrees.
Ultimately, the fund managers compared Gabe to marginal MATS graduates and concluded that funding him was favorable. They believed Gabe was better at independently generating strategic directions and being self-motivated for his work, compared to the median MATS graduate. They also considered the downside risks and personal costs of being a Master's student to be lower than those of independent research, as academia tends to provide more social support and mental health safeguards, especially for Master's degrees (compared to PhDs). Additionally, Gabe's familiarity with Stanford from his undergraduate studies was seen as beneficial on that axis. The fund managers also recognized the value of a Master's degree credential for several potential career paths, such as pursuing a PhD or working in policy. However, a caveat is that Gabe might have less direct mentorship relevant to alignment compared to MATS extension grantees.
Outcomes: In a recent progress report, Gabe noted that the grant allowed him to dedicate more time to schoolwork and research instead of taking on part-time jobs. He produced several new publications that received favorable media coverage and was accepted to 4 out of 6 PhD programs he applied to. The grant also allowed him to graduate in March instead of June. Due to his early graduation, Gabe will not need to use the entire granted amount, saving us money.
Joshua Clymer ($1,500) - Compute funds for a research paper introducing an instruction-following generalization benchmark
The Long-Term Future Fund provided a $1,500 grant to Joshua Clymer for compute funds to rent A100 GPUs for experiments on instruction-following generalization. Although Clymer had previously worked on AI safety field-building and communications, this was his first technical AI safety project.
The fund was interested in whether models trained within a specific distribution can generalize "correctly" out of distribution, such as on benchmarks like TruthfulQA, instead of learning the idiosyncrasies of human evaluators. More broadly, the fund wanted to ensure that models can faithfully follow human instructions even when operating in vastly different contexts from their training data. The grantee believed that training models on general instruction-following could be a plausible approach to aligning AI systems with insufficiently specified rewards.
The proposal employed a method called "sandwiching," where the model is trained on data from a non-expert who cannot evaluate the model's performance, but the later evaluation is conducted by an expert. We were excited about the grant as it was cheap and tractable, while addressing an obvious difficulty in AI alignment, making it an attractive funding opportunity.
Outcomes: The paper and benchmark have been published, with the authors finding that reward models do not inherently learn to evaluate 'instruction-following' and instead favor personas that resemble internet text. In other words, models trained to follow instructions on easy tasks do not naturally follow instructions on hard tasks, even when we are fairly confident the model has the knowledge to answer the questions correctly. You can also check out Joshua’s thoughts on the project on the Alignment Forum.
Logan Smith ($40,000) - 6-month stipend to create language model (LM) tools to aid alignment research through feedback and content generation
The Long-Term Future Fund has been supporting Logan Smith for 2 years, providing a stipend of $40,000 every 6 months. Recently, Logan and their team published exciting mechanistic interpretability results where they used Sparse Autoencoders (SAEs) to find highly interpretable directions in language models. The fund has also supported Hoagy Cunningham, the first author of the linked paper, for the same work.
This work was quite similar to research later published by Anthropic, which generated significant excitement in the AI safety community (and was a precursor to the recent excitement around Golden Gate Claude). You may find it helpful to read this post comparing their two papers. I believe some of the independent researchers funded by the Long-Term Future Fund to work on SAEs were subsequently hired by Anthropic to continue their interpretability work there.
Although I (Linch) am not an expert in the field, my impression is that the work on sparse autoencoders, both from independent researchers and Anthropic, represents some of the most meaningful advances in AI safety and interpretability in 2023.
Note: When an earlier private version of these notes was circulated, a senior figure in technical AI safety strongly contested my description. They believe that the Anthropic SAE work is much more valuable than the independent SAE work, as both were published around the same time, but the Anthropic work provides sufficient evidence to be worth extending by other researchers, whereas the independent research was not dispositive. I find these arguments plausible but not overwhelmingly convincing. Unfortunately, I lack the technical expertise to form an accurate independent assessment on this matter.
Alignment Ecosystem Development ($99,330) - 1-year stipend for 1.25 FTEs to build and maintain digital infrastructure for the AI safety ecosystem, plus AI Safety-related domains and other expenses
The Long-Term Future Fund provided a $99,330 grant to Alignment Ecosystem Development (AED), an AI safety field-building nonprofit, through their fiscal sponsor, Ashgro Inc. The grant covered a 1-year stipend for 1.25 full-time equivalents (FTEs) to build and maintain digital infrastructure for the AI safety ecosystem, as well as expenses related to AI safety domains and other assorted costs. AED has built, taken ownership of, closely partnered with, and/or maintained approximately 15 different projects, including the AI Safety Map (aisafety.world), AI Safety Info (aisafety.info), and AI Safety Quest (aisafety.quest), with the overarching objective of growing and improving the AI safety ecosystem. Bryce Robertson, a volunteer, currently works on the ecosystem full-time and would like to transition into a paid role. The grant allocated $66,000 for stipends and the remaining funds for various software and other expenses.
Fund managers had differing opinions on whether this grant represented a good use of marginal funds. The primary investigator strongly supported the grant, while two other fund managers who examined it in detail were not convinced that it met the fund's high funding bar. Habryka, one of the fund managers, expressed excitement for the grant primarily because it continued the work that AED leadership had historically done in the AI safety space. Although not particularly enthusiastic about any individual project in the bundle, Habryka had heard of many people benefiting from AED's support in various ways and held a prior belief that good infrastructure work in this space often involves providing a significant amount of illegible and diffuse support. For example, Habryka noted that plex from AED had been quite useful in moderating and assisting with Rob Miles's Discord, which Habryka considered a valuable piece of community infrastructure, contributing positively to the impact of Rob Miles's videos in informing people about AI safety.
Arguments in favor of the grant included the websites being useful pieces of community infrastructure that can help people orient towards AI safety and the historically surprising popularity of AED's work. However, some fund managers raised concerns that AED might be spreading themselves too thin by working on many different projects rather than focusing on just one. They also questioned the quality of the outputs and worried that if newcomers' first exposure to AI safety was through AED's work, they might become unimpressed with the field as a result.
In accordance with the fund's procedure, the project was put to a vote, and the grant ultimately passed the funding bar. At a more abstract level, the fund's general policy or heuristic has been to lean towards funding when one fund manager is very excited about a grant, and other fund managers are more neutral. The underlying implicit model here is that individual excitement is more likely to identify grants with the potential for significant impact or "hits" in a hits-based giving framework, compared to consensus decisions where everybody is happy about a grant, but no one is actively thrilled. The underlying philosophy is explicated further in an earlier comment I wrote in January 2021, about a year before I joined LTFF.
Lisa Thiergart and Monte MacDiarmid ($40,000) - Conference publication of using activation addition for interpretability and steering of language models
The Long-Term Future Fund provided a $40,000 grant, administered by FAR AI, to Lisa Thiergart and Monte MacDiarmid for converting their research on activation addition for interpretability and steering of language models into a format suitable for academic audiences. At the time of funding, the key points of the research had already been completed and written up on LessWrong, including "Interpretability of a maze-solving network" and "Steering GPT2-XL using activation addition." The grant enabled the team to work with external writers to translate their work for academic publication.
The fund managers identified several points in favor of this grant. Firstly, the research work itself was of high quality. Secondly, having the work accepted and critiqued by academic audiences seems valuable for both increasing the rigor of the alignment subfield and encouraging more mainstream machine learning researchers to work on alignment-related issues. Lastly, working with external authors can be a valuable experiment in creating a reliable pipeline for converting high-quality research (blog) posts into academic papers, without requiring significant time investment from senior alignment researchers, especially ones without significant prior academic ML experience. This approach could potentially provide many of the benefits of mainstream academic publication, at a lower cost than asking senior alignment researchers to conform to academic norms.
However, the grant might come with nontrivial downside risks or (additional, non-monetary) costs. Work of this nature might have significant negative capabilities externalities for the world, as improved model steering could increase commercial viability and accelerate the development of more capable AI systems. Of course, this concern also applies to other commercially relevant alignment efforts, such as RLHF (Reinforcement Learning from Human Feedback) and Constitutional AI. I and other LTFF fund managers have frequently found it difficult to reason about whether to support positive alignment efforts that nonetheless have nontrivial capabilities externalities, and getting this decision right is still a work in progress. (My own guess is that we made the right call in this case.)
Outcomes: The two preprints, "Activation Addition: Steering Language Models Without Optimization" and "Understanding and Controlling a Maze-Solving Policy Network," are now available on arXiv. However, as of the time of writing this payout report, I think they have not yet been formally published in a conference or journal.
Robert Miles ($121,575) - 1-year stipend + contractor and other expenses to continue his communications and outreach projects
The Long-Term Future Fund provided a grant of $121,575 to Robert Miles, administered through his fiscal sponsor, Ashgro Inc. The grant includes a $71,000 stipend for Rob and $50,000 for contractors and other expenses to support his work in producing YouTube videos, appearing on podcasts, helping researchers communicate their work, growing an online community to help newcomers get more involved, and building a large online FAQ at aisafety.info.
This grant continues the support we've given Rob in the past; we were his first significant funder and have been consistently happy with his progress. We believe Rob's historical impact with his outreach projects has been surprisingly large. His videos are of very high quality, with high overall production value and message fidelity. My understanding is that many technical researchers report satisfaction with the way their ideas are presented. Additionally, his videos are popular, with a typical Rob Miles video garnering between 100,000 and 200,000 views. While this is significantly lower than top technical YouTube channels, it has gained more traction than almost any other technical AI safety outreach to date.
The grant is also fairly cheap relative to the historical impact. Rob requested a $71,000 stipend, which is substantially lower than both his counterfactual earnings and the pay of other individuals who are competent in both technical AI safety and communications.
However, we think Rob's non-YouTube projects are less successful or impactful than his main YouTube channel. This is hard to definitively assess since much of it is based on private or hard-to-aggregate information, such as Rob privately advising researchers on how to communicate their work. Nonetheless, we (I) think it's unlikely that the non-YouTube work is as valuable as the videos. That said, the primary investigator of the grant believes Rob’s non-YouTube work is still significantly more valuable than our marginal grants and thus worth funding.
Another point against the grant is that Rob's YouTube channel productivity has been rather low, especially recently, with no new videos produced in the last year. However, he recently released a very long and high-quality video that, in addition to being a useful summary about many of the important events in AI (safety) over the last year, also goes into some detail about why he hasn't produced as much content recently.
Anonymous ($17,000) - Top-up stipend for independent research to evaluate the security of new biotechnology advances, outline vulnerabilities, propose solutions, and gain buy-in from relevant stakeholders
I was the primary evaluator of this grant, which provided a top-up stipend to an anonymous grantee for independent research on the security of new biotechnology advances. Originally, we evaluated the grantee for a 1-year stipend, but during our evaluation period, they secured external funding for most of the grant period. The grantee asked for the difference as a top-up stipend, which we provided.
To estimate the impact of this grant, I thought the work was quite valuable but had to defer to our advisors more than I would have liked. I asked an academic in the field to provide frank feedback on the grant and double-checked their reasoning with an external, non-academic biosecurity researcher. Both were positive about the angle of attack and the applicant's fit for the role.
I believe this is an obviously important area to investigate within biosecurity ("big if true"), and I'm glad someone is looking into it. The applicant seemed fairly competent and impressive by conventional metrics, and advisors noted their unusually strong security mindset and background. We thought they could do a good job discreetly, which is hard to replicate with other funding arrangements (e.g., publish-or-perish in academia). We also thought the applicant was unusually suited for this work specifically, as the combination of biosecurity understanding and security mindset is rare. The applicant also seemed well-connected; a lack of connections would otherwise be a major concern with independent work in threat analysis. Under some reasonable assumptions, I think this might have been LTFF's highest impact biosecurity grant last year, or at least in the top 3.
However, I thought the downside risks of doing this work well were nontrivial, and I also had to defer more than I'd like to our advisors that it's worth pushing ahead on. I don't love giving anonymous grants and have updated slightly against them over the last year. I think the original intent of anonymous/private grants (offering privacy for personally sensitive concerns or, less frequently, giving people an option to confidentially work on info-hazards) has been somewhat eroded. Plausibly, some of our grantees ask for anonymity just because having public funding sources can be annoying, which I think is understandable individually but creates a worse epistemic environment overall. However, please note that this is a personal update, and other fund managers may hold different views on anonymous grants (in an earlier draft, one fund manager specifically noted their disagreement with my update).
Unfortunately, both the grantee and I think it's better not to share every detail of this grant. Nonetheless, my hope is that sharing some details about our anonymous grants is helpful for donors, so that the ~10% of our grants that are anonymous do not look completely like a black box to small donors and other community members.
Other updates
- Our funding bar has been somewhat variable in the last year. It first went up at the end of 2022 in response to a decrease in the overall funding available to long-term future-focused projects, and then increased again a few times in response to liquidity issues within the Long-Term Future Fund itself, peaking around September 2023.
- There was a 2:1 donation match from Open Phil. That match was completely filled, thanks to our generous donors.
- Thanks to the increase in funding, we were able to lower our funding bar from its peak. We’re currently at around the early 2023 funding bar. I expect this bar to be relatively stable in the coming months compared to 2023, but I generally expect our funding bar to vary more over time and to depend more on individual donations than it has historically (in 2022 and earlier).
- Thank you to everybody who donated to us. Your contributions are key in supporting projects that we think are very valuable for the world.
- Longer-term, I’d like to seek out institutional funding and larger sources of individual funding, to help stabilize the fund and build out a longer runway.
- We’ve distanced ourselves from Open Phil since Aug 2023, to help increase diversity of perspectives and increase funder and grantee independence.
- We are in the process of spinning out of Effective Ventures, our fiscal sponsor.
- I (Linch Zhang) have joined EA Funds full-time, with a focus on LTFF.
Other writings
Compared to past years, LTFF and its fund managers have become substantially more active in writing and public communications. Here are relevant writings since our last payout report period:
- What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund: A detailed discussion of “grantmaking thresholds” for marginal grants at LTFF. Essentially, given that we have limited resources and many good projects to fund, how do we choose which grants to make per $X we have? The post covers different projects we might want to fund at different thresholds ($X per 6 months).
- Select examples of adverse selection in longtermist grantmaking: I reviewed my past experiences with “adverse selection” as a grantmaker, that is, situations where we choose to not fund a project that initially looked good, often due to surprising and private information.
- LTFF and EAIF are unusually funding-constrained right now: Our fundraising post in September. We are in much less of a funding crunch now than we were in September, but the post may still be helpful for you to decide whether LTFF (or EAIF) are good donation targets relative to your next best alternative.
- The Long-Term Future Fund is looking for a full-time fund chair: Our hiring post for LTFF fund chair. Mostly a historical curiosity now that we’re no longer looking at new applications, but community members may be interested in reading it to understand the responsibilities and day-to-day of work at LTFF.
- Hypothetical grants that the Long-Term Future Fund narrowly rejected: A continuation of the marginal grants post, in that it’s a more narrow and tightly scoped list of hypothetical grants that are very close to our current funding bar. If you’re considering whether to fund LTFF or not, I think this post may be the best one for understanding what your marginal dollars would most likely end up funding.
- Please note that while the post was framed as hypothetical grants the LTFF narrowly rejected, our funding bar has decreased some since the publication of that post. So grants just barely below our past bar should be just barely above our current bar. Going forward, in the next few months, donors should think of that post as “hypothetical grants that LTFF will narrowly accept.”
- LessWrong comments discussion of whether longtermism or LTFF work has been net negative so far: A LW comments discussion of whether we should be worried about donating to LTFF, asking for assurances that LTFF will not fund net-negative work. Multiple fund managers offered their individual perspectives. Tl;dr: We are unfortunately unable to provide strong assurances. :/ Doing robustly good work in a highly speculative domain is very difficult, and fund managers are not confident that we can always be sure our work is good.
- I want to quickly note that the fund managers who commented (myself included) are not necessarily representative of LTFF, and my guess is we’re overall more negative/uncertain than the fund’s median.
- Lawrence Chan: What I would do if I wasn’t at ARC Evals: Lawrence Chan, a part-time guest fund manager at LTFF, discusses what he’d likely do if he wasn’t at ARC Evals (his day job). This might be relevant to community members considering career pivots in or into AI Safety/x-risk reduction.
- Lawrence, if you’re reading this, I hope you’d consider joining LTFF full-time! 😛
- Caleb Parikh on AI consciousness: Caleb Parikh, Project Lead of EA Funds and interim LTFF fund chair, discusses why he thinks the broader community is underinvesting in research projects working on AI consciousness.
- LTFF to be more stringent in evaluating mechanistic interpretability grants: We think funding and hiring in mechanistic interpretability is now less neglected outside of LTFF, thanks in large part to recent advances in mechanistic interpretability (including from LTFF grantees!). So we’re increasing our bar for mechanistic interpretability grants, in part to help encourage technical AI safety work on other agendas.
- Please note that this is a mild/moderate technical change. We’re still broadly optimistic about the mech interp work being done, and many of us will continue to encourage more people to work on it.
Appendix
Other Grants We Made During This Time Period
You can see a list of all of our public grants from this period here.
Footnotes
[1] Please note that the highlighted grants are likely to be unrepresentative of our average grant, and certainly of our marginal grant. To have a better sense of what marginal donations are likely to buy, please read Hypothetical grants that the Long-Term Future Fund narrowly rejected, my earlier post on this exact question.