Fake Bedrock

fake Bedrock server for local dev and tests. Real wire protocol, fake responses you control. 111 operations, deterministic, free. Not a mock library, not a real LLM.

Need a fake Bedrock? Use fakecloud. Real HTTP server that speaks the Bedrock wire protocol and returns whatever you tell it to. 111 operations, deterministic, free.

curl -fsSL https://raw.githubusercontent.com/faiscadev/fakecloud/main/install.sh | bash
fakecloud

Point any AWS SDK at http://localhost:4566.

What "fake" means here

Fake is not mock. A mock replaces the SDK internals with stub functions. A fake is a real running service that happens to return fake (configured) responses. The difference matters:

fakecloud is a fake AWS — a complete server on port 4566 that speaks 23 AWS services' wire protocols. Your code thinks it's talking to real AWS.

And it is, as far as the protocol is concerned. The part that's fake is the response content — which is exactly the part you want to control in tests.

Fake Bedrock, not fake LLM

fakecloud's Bedrock is deliberately not backed by a real model (no Ollama, no local Claude). Here's why:

So Bedrock in fakecloud returns what you configure. That's the whole product:

await fc.bedrock.setResponseRule({
  whenPromptContains: "summarize",
  respond: { completion: "- one\n- two\n- three" },
});

Now InvokeModel with a prompt containing "summarize" returns exactly that. Every time. In milliseconds.

111 operations

The control plane is real. Create guardrails, customization jobs, imported models, inference profiles, provisioned throughput — all the Bedrock shape your infrastructure code deals with:

import boto3
bedrock = boto3.client('bedrock',
    endpoint_url='http://localhost:4566',
    aws_access_key_id='test', aws_secret_access_key='test', region_name='us-east-1')

# guardrail crud
g = bedrock.create_guardrail(
    name='safety-v1',
    contentPolicyConfig={'filtersConfig': [
        {'type': 'SEXUAL', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
    ]},
    blockedInputMessaging='blocked',
    blockedOutputsMessaging='blocked',
)

v = bedrock.create_guardrail_version(guardrailIdentifier=g['guardrailId'])
bedrock.list_guardrails()

# customization job (fine-tuning flow shape)
bedrock.create_model_customization_job(
    jobName='my-ft-job',
    customModelName='my-claude-ft',
    roleArn='arn:aws:iam::000000000000:role/bedrock',
    baseModelIdentifier='anthropic.claude-3-haiku-20240307-v1:0',
    trainingDataConfig={'s3Uri': 's3://fakecloud-demo/train.jsonl'},
    outputDataConfig={'s3Uri': 's3://fakecloud-demo/output/'},
    hyperParameters={'epochCount': '1'},
)

Terraform, CloudFormation, and CDK that deploy Bedrock infra can all target fakecloud.

Fault injection

await fc.bedrock.injectFault({
  operation: "Converse",
  error: "ValidationException",
  messageSubstring: "The model returned the following errors: too many messages",
});

// your code catches ValidationException, falls back to single-turn mode
const result = await resilientConverse(rt, longConversation);
expect(result.mode).toBe("single-turn-fallback");

Assert your code handles every Bedrock error code you read in the docs. No more "I'll test that path later."

Comparison

ToolTypeWire protocolDeterministicCost
fakecloudEmulator (fake server)RealYesFree
Real BedrockCloud serviceRealNo$$$ per token
Ollama / LM StudioReal local LLMDifferentNoCPU+RAM
LocalStack UltimateEmulator with OllamaRealNo (Ollama)Paid Ultimate tier
Moto / aws-sdk-client-mockMock libraryNone (no HTTP)YesFree, but misses HTTP bugs