John(Yueh-Han) Chen (@jcyhc_ai) Twitter Tweets • TwiCopy

John(Yueh-Han) Chen

@jcyhc_ai

AI Research @berkeley_ai

ID:1698607819149418496

Link: https://www.linkedin.com/in/yueh-han-chen/

Joined: 04-09-2023 08:04:10

15 Tweets

99 Followers

231 Following

John Yang

@jyangballin

1 month ago

SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source!

We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code
github.com/princeton-nlp/…

Carlos E. Perez

@IntuitMachine

2 months ago

1/n Are AI Forecasters the Next Big Thing? How A.I. Learned to Make Freaky Accurate Forecasts

New research unveils an ingenious system that unlocks the untapped potential of artificial intelligence to predict the future. Drawing on the burgeoning prowess of language models to…

Owain Evans

@OwainEvans_UK

3 months ago

Great paper, showing impressive performance from LLMs for forecasting using an elegant retrieval, reasoning + finetuning approach. I expect further improvements in forecasting are possible with current LLMs with more iteration.

Likes: 17 · Replies: 0 · Retweets: 6

Fred Zhang

@FredZhang0

3 months ago

Beating prediction markets with chatbots sounds cool. In a recent work arxiv.org/abs/2402.18563, we get somewhat close to that.

As another perspective, forecasting is a great capability domain to benchmark LM reasoning, calibration, pre-training knowledge, and more. 🧵1/n

Jacob Steinhardt

@JacobSteinhardt

3 months ago

Can we build an LLM system to forecast geo-political events at the level of human forecasters?

Introducing our work Approaching Human-Level Forecasting with Language Models!

Arxiv: arxiv.org/abs/2402.18563
Joint work with Danny Halawi, Fred Zhang, and John(Yueh-Han) Chen

Dan Hendrycks

@DanHendrycks

3 months ago

GPT-4 with simple engineering can predict the future around as well as crowds:
arxiv.org/abs/2402.18563
On hard questions, it can do better than crowds.

If these systems become extremely good at seeing the future, they could serve as an objective, accurate third-party. This would…
