x.com
104.244.42.129  Public Scan

Submitted URL: https://x.com/anthropicai?campaignId=11453111&source=i_email&medium=email&content=Oct2024Sonnet&messageTypeId=...
Effective URL: https://x.com/anthropicai?campaignId=11453111&source=i_email&medium=email&content=Oct2024Sonnet&messageTypeId=...
Submission: On October 26 via manual from IN — Scanned from US

Form analysis 0 forms found in the DOM

Text Content

ANTHROPIC

671 posts

Anthropic

@AnthropicAI

We're an AI safety and research company that builds reliable, interpretable, and
steerable AI systems. Talk to our AI assistant Claude at http://Claude.ai.
anthropic.com
Joined January 2021
33 Following
371.5K Followers

ANTHROPIC’S POSTS

Pinned
Anthropic

@AnthropicAI
·
Oct 22

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku.
We’re also introducing a new capability in beta: computer use. Developers can
now direct Claude to use computers the way people do—by looking at a screen,
moving a cursor, clicking, and typing text.

488 replies · 3.1K reposts · 10K likes · 3.3M views



Anthropic

@AnthropicAI
·
20h

Over the past few months, our Interpretability team has put out a number of
smaller research updates. Here’s a thread of some of the things we've been up
to:
13 replies · 131 reposts · 1.2K likes · 135K views


Anthropic

@AnthropicAI
·
20h

We also report on reproducing Kharlapenko et al.'s "self explaining sparse
autoencoder" features experiments:
https://transformer-circuits.pub/2024/august-update/index.html#self-explaining-sae…
1 reply · 1 repost · 69 likes · 17K views


Anthropic

@AnthropicAI
·
20h

And finally, you might have seen our collaboration with Anthropic’s Societal
Impacts team on Feature Steering, also published today:
Quote
Anthropic

@AnthropicAI
·
Oct 25
New Anthropic research: Evaluating feature steering. In May, we released Golden
Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature
steering”. We've now done a deeper study on the effects of feature steering.
Read the post: http://anthropic.com/research/evaluating-feature-steering…

2 replies · 4 reposts · 90 likes · 27K views



Anthropic

@AnthropicAI
·
Oct 25

New Anthropic research: Evaluating feature steering. In May, we released Golden
Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature
steering”. We've now done a deeper study on the effects of feature steering.
Read the post: http://anthropic.com/research/evaluating-feature-steering…

22 replies · 185 reposts · 1.1K likes · 132K views


Anthropic

@AnthropicAI
·
Oct 25

Finally, we discovered a feature that significantly reduces bias scores across
nine social dimensions within the sweet spot. This did come with a slight
capability drop, which highlights potential trade-offs in feature steering.

4 replies · 15 reposts · 145 likes · 48K views


Anthropic

@AnthropicAI
·
Oct 25

We hope our preliminary findings inspire further research on steering methods
for safer, more reliable model outputs. Read the full research blog for detailed
results, insights, and limitations:
"Evaluating feature steering: A case study in mitigating social biases"

From anthropic.com
5 replies · 4 reposts · 94 likes · 15K views



Anthropic

@AnthropicAI
·
Oct 24

Claude can now write and run code. We've added a new analysis tool. The tool
helps Claude respond with mathematically precise and reproducible answers. You
can then create interactive data visualizations with Artifacts. Enable the
feature preview: https://claude.ai/new?fp=1.

171 replies · 945 reposts · 4.6K likes · 515K views

