Our new research paper: Adding Error Bars to Evals. AI model evaluations don’t usually include statistics or uncertainty. We think they should. Read the blog post: https://lnkd.in/d2jKfpyT

When a new AI model is released, the accompanying model card typically reports a matrix of evaluation scores on a variety of standard evaluations, such as MMLU, GPQA, or the LSAT. But it’s unusual for these scores to include any indication of the uncertainty, or randomness, surrounding them. This omission makes it difficult to compare the evaluation scores of two models in a rigorous way.

“Randomness” in language model evaluations takes a couple of forms. A model’s output tokens may be nondeterministic, so re-evaluating the same model on the same evaluation may produce slightly different results each time. This randomness is known as measurement error. But there’s another form of randomness that’s no longer visible by the time an evaluation is performed: sampling error. Of all the possible questions one could ask about a topic, an evaluation includes some and leaves out others, and a different draw of questions would have produced a slightly different score.

In our research paper, we recommend techniques for reducing measurement error and properly quantifying sampling error in model evaluations. With a simple assumption in place—that evaluation questions were randomly drawn from some underlying distribution—we develop an analytic framework for model evaluations using statistical theory. Drawing on the science of experimental design, we make a series of recommendations for performing evaluations and reporting the results in a way that maximizes the amount of information conveyed.

Our paper makes five core recommendations. These recommendations will likely not surprise readers with a background in statistics or experimentation, but they are not yet standard in the world of model evaluations. Specifically, our paper recommends:

1. Computing standard errors using the Central Limit Theorem
2. Using clustered standard errors when questions are drawn in related groups
3. Reducing variance by resampling answers and by analyzing next-token probabilities
4. Using paired analysis when two models are tested on the same questions
5. Conducting power analysis to determine whether an evaluation can test a specific hypothesis

For mathematical details on the theory behind each recommendation, read the full research paper here: https://lnkd.in/dBrr9zFi
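To make recommendations 1, 2, and 4 concrete, here is a minimal sketch in Python with NumPy. The scores, the cluster structure, and the helper functions are invented for illustration under the paper's random-sampling assumption; this is not code from the paper itself.

```python
import numpy as np

# Hypothetical per-question scores for two models on the same 500-question eval:
# 1.0 = correct, 0.0 = incorrect. In practice these come from your eval harness.
rng = np.random.default_rng(0)
scores_a = rng.binomial(1, 0.78, size=500).astype(float)
scores_b = rng.binomial(1, 0.74, size=500).astype(float)

# (1) CLT-based standard error of the mean score: s / sqrt(n).
def standard_error(scores):
    return scores.std(ddof=1) / np.sqrt(len(scores))

print(f"Model A: {scores_a.mean():.3f} +/- {standard_error(scores_a):.3f} (SE)")

# (2) Cluster-robust standard error for when questions arrive in related
# groups (e.g. several questions about one reading passage). Scores within
# a cluster are correlated, so residuals are summed cluster by cluster.
def clustered_standard_error(scores, cluster_ids):
    resid = scores - scores.mean()
    cluster_sums = np.array(
        [resid[cluster_ids == c].sum() for c in np.unique(cluster_ids)]
    )
    return np.sqrt((cluster_sums ** 2).sum()) / len(scores)

clusters = np.repeat(np.arange(100), 5)  # hypothetical: 100 passages x 5 questions
print(f"Model A, clustered SE: {clustered_standard_error(scores_a, clusters):.3f}")

# (4) Paired analysis: both models answered the same questions, so take the
# per-question score difference and compute its standard error directly.
diffs = scores_a - scores_b
print(f"A - B: {diffs.mean():.3f} +/- {standard_error(diffs):.3f} (paired SE)")
```

On real evals the paired standard error is typically much smaller than what a comparison of two independent error bars would suggest, because shared per-question difficulty correlates the two models' scores and cancels in the difference; the synthetic data above is uncorrelated, so it won't show that shrinkage.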
Anthropic
Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems.
About us
We're an AI research company that builds reliable, interpretable, and steerable AI systems. Our first product is Claude, an AI assistant for tasks at any scale. Our research interests span multiple areas including natural language, human feedback, scaling laws, reinforcement learning, code generation, and interpretability.
- Website
- https://www.anthropic.com/
- Industry
- Research Services
- Company size
- 501-1,000 employees
- Type
- Privately Held
Updates
- We’ve added a new prompt improver to the Anthropic Console. Take an existing prompt and Claude will automatically refine it with prompt engineering techniques like chain-of-thought reasoning. The prompt improver also makes it easy to adapt prompts originally written for other AI models to work better with Claude. Read more: https://lnkd.in/dx-5sp5P.
- Coinbase customers now get faster and more accurate support with Claude powering their chatbot, help center search, and customer service teams across 100+ countries: https://lnkd.in/gWCvNy2u
- Read how Asana uses Claude to help 150,000+ companies automate workflows and save countless hours on tasks: https://lnkd.in/dee_PFDr
- Claude 3.5 Haiku is now available on our API, Amazon Bedrock, and Google Cloud's Vertex AI. Haiku is fast and particularly strong at coding. It outperforms state-of-the-art models—including GPT-4o—on SWE-bench Verified, which measures how models solve real software issues. During final testing, Haiku surpassed Claude 3 Opus, our previous flagship model, on many benchmarks—at a fraction of the cost. As a result, we've increased pricing for Claude 3.5 Haiku to reflect its increase in intelligence: anthropic.com/claude/haiku. Claude 3 Haiku remains available for use cases that benefit from image input or its lower price point: https://lnkd.in/e9yNTtNp.
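As an illustration that isn't part of the announcement itself, a call to Claude 3.5 Haiku through the Anthropic Python SDK looks roughly like this; the dated model id is an assumption based on the aliases in use at the time, so check the models documentation for the current one.

```python
import anthropic

# Minimal sketch of calling Claude 3.5 Haiku via the Messages API.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-haiku-20241022",  # assumed dated alias; verify before use
    max_tokens=512,
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that checks whether a string is a palindrome.",
        }
    ],
)
print(response.content[0].text)
```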
- Claude can now view images within a PDF, in addition to text. Enable the feature preview to get started: https://claude.ai/new?fp=1. This helps Claude 3.5 Sonnet more accurately understand complex documents, such as those laden with charts or graphics. The Anthropic API now also supports PDF inputs in beta: https://lnkd.in/emvau9Ez
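For the API beta, a request with a PDF attached looks roughly like the sketch below. The beta flag, the dated model id, and the file name are assumptions based on the documentation at the time of writing, so check the current docs before relying on them.

```python
import base64
import anthropic

# Read and base64-encode a local PDF (hypothetical file name).
with open("quarterly_report.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed dated alias; verify before use
    betas=["pdfs-2024-09-25"],  # assumed PDF beta flag at the time of writing
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            # The document block carries the PDF so Claude can read both
            # its text and the images on each page.
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_data,
                },
            },
            {"type": "text", "text": "Summarize the charts in this document."},
        ],
    }],
)
print(message.content[0].text)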
- See how Claude helps Hebbia deliver AI-powered document analysis to top financial and legal institutions, turning thousands of pages into actionable insights: https://lnkd.in/e4DnFAxs
- You can now dictate messages to Claude on our iPhone, iPad, and Android apps. Download on Google Play: anthropic.com/android. Or on the Apple App Store: anthropic.com/ios.
- The Claude app is now available to download on Mac and Windows: claude.ai/download.
- Claude is now available on GitHub Copilot. Starting today, developers can select Claude 3.5 Sonnet in Visual Studio Code and GitHub.com. Access will roll out to all Copilot Chat users and organizations over the coming weeks. https://lnkd.in/eaJG3wwM