Bedrock emulator
Bedrock emulator for tests: 111 operations, real wire protocol, fault injection, configurable responses. Deterministic, offline, free. Not a mock library, not a real LLM.
Need a Bedrock emulator? Use fakecloud. Not a mock library. Not a real LLM. A real server that speaks the Bedrock wire protocol and returns exactly what you tell it to.
```bash
curl -fsSL https://raw.githubusercontent.com/faiscadev/fakecloud/main/install.sh | bash
fakecloud
```

Point any AWS SDK at http://localhost:4566. That's the setup.
Why emulator, not mock, not Ollama
Three categories of tools, three different jobs:
- Mock library (moto, aws-sdk-client-mock) — replaces the AWS SDK internals with stubs. Never hits HTTP. Your code isn't tested against anything that looks like Bedrock — it's tested against a fake object matching the SDK interface. If you assemble the request wrong, the mock doesn't care.
- Real LLM locally (Ollama, LM Studio) — real inference on your CPU. Slow. Non-deterministic. GBs of weights. Your tests can't assert on output because every run returns different text. A CI runner takes 8 minutes to boot the model. LocalStack's "Ultimate"-tier Bedrock does this.
- Emulator (fakecloud) — real HTTP server on port 4566. Speaks Bedrock's JSON wire protocol. Your AWS SDK sends real requests. You control the response per prompt. Deterministic. Millisecond latency. No GBs of weights.
Tests want the third one. You're testing your code, not whether the model happens to understand the prompt today.
What fakecloud Bedrock does
111 operations across the full Bedrock surface, not just the runtime:
- Runtime: `InvokeModel`, `InvokeModelWithResponseStream`, `Converse`, `ConverseStream` — EventStream binary encoded, same as real AWS.
- Guardrails: full CRUD + versioning + apply + content evaluation. Test your guardrail config locally before deploying.
- Custom models: `CreateModelCustomizationJob`, `GetCustomModel`, `ListCustomModels`, deletion, versioning.
- Model import: `CreateModelImportJob` for the bring-your-own-weights flow.
- Inference profiles, provisioned throughput, model copy jobs, async invoke + batch inference via S3.
- Prompt management: prompts + prompt routers with versioning.
- Foundation model agreements: the `PutUseCaseForModelAccess` / `CreateFoundationModelAgreement` access approval flow.
- Automated reasoning policies, evaluation jobs, marketplace, resource policies, logging configuration.
What it does NOT do: actual inference. Response content is whatever you configured. That's the point — deterministic tests.
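The streaming operations use AWS's binary EventStream framing. A simplified sketch of that framing (header encoding omitted; illustrative of the wire format, not of fakecloud's internals): each message carries a prelude of total length and headers length, a CRC32 over the prelude, the body, and a CRC32 over everything preceding it.

```python
import struct
import zlib

def encode_frame(payload: bytes, headers: bytes = b'') -> bytes:
    """Encode one vnd.amazon.eventstream message frame."""
    # prelude (8) + prelude CRC (4) + headers + payload + message CRC (4)
    total_len = 12 + len(headers) + len(payload) + 4
    prelude = struct.pack('>II', total_len, len(headers))
    prelude_crc = struct.pack('>I', zlib.crc32(prelude))
    body = prelude + prelude_crc + headers + payload
    message_crc = struct.pack('>I', zlib.crc32(body))
    return body + message_crc

def decode_frame(frame: bytes) -> bytes:
    """Decode one frame, verifying both CRCs, and return the payload."""
    total_len, headers_len = struct.unpack('>II', frame[:8])
    (prelude_crc,) = struct.unpack('>I', frame[8:12])
    assert zlib.crc32(frame[:8]) == prelude_crc, 'prelude CRC mismatch'
    (message_crc,) = struct.unpack('>I', frame[total_len - 4:total_len])
    assert zlib.crc32(frame[:total_len - 4]) == message_crc, 'message CRC mismatch'
    return frame[12 + headers_len:total_len - 4]

chunk = b'{"bytes":"..."}'
assert decode_frame(encode_frame(chunk)) == chunk
```

This is why a plain OpenAI-style SSE server can't stand in for `InvokeModelWithResponseStream`: the SDK expects this binary framing, CRCs and all.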
Configurable responses per prompt
```typescript
import { FakeCloud } from "fakecloud";
import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

const fc = new FakeCloud();
beforeEach(() => fc.reset());

test("spam classifier routes to review on borderline score", async () => {
  await fc.bedrock.setResponseRule({
    whenPromptContains: "classify spam",
    respond: {
      completion: JSON.stringify({ label: "borderline", confidence: 0.62 }),
    },
  });
  const rt = new BedrockRuntimeClient({ endpoint: "http://localhost:4566" });
  const out = await classifyEmail(rt, "buy now!!");
  expect(out.routedTo).toBe("human-review");
});
```

Fault injection
```typescript
await fc.bedrock.injectFault({
  operation: "InvokeModel",
  error: "ThrottlingException",
  count: 2,
});

// your code retries with backoff; third call succeeds
const out = await yourClassifier(rt, "buy now!!");
expect(fc.bedrock.getCallHistory()).toHaveLength(3);
```

Real retry code paths get exercised. No more "we'll test the retry logic later."
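The retry path that fault injection exercises typically looks like exponential backoff around the SDK call. A minimal sketch (in Python for brevity; `ThrottlingError` and `with_backoff` are illustrative names, not fakecloud or AWS SDK APIs):

```python
import time

class ThrottlingError(Exception):
    """Stand-in for the SDK's throttling exception."""

def with_backoff(call, max_attempts=3, base_delay=0.1):
    """Retry `call` on throttling, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * 2 ** attempt)

# Simulate the injected-fault scenario: two throttles, then success.
calls = {'n': 0}
def flaky():
    calls['n'] += 1
    if calls['n'] <= 2:
        raise ThrottlingError()
    return 'ok'

assert with_backoff(flaky) == 'ok'
assert calls['n'] == 3  # matches the call-history assertion above
```

With `count: 2` injected, a correct backoff wrapper produces exactly three recorded calls; a missing or broken retry produces one, and the test fails.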
Guardrails, locally
```python
import boto3

bedrock = boto3.client('bedrock', endpoint_url='http://localhost:4566',
                       aws_access_key_id='test', aws_secret_access_key='test',
                       region_name='us-east-1')

g = bedrock.create_guardrail(
    name='no-pii',
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
        ],
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [{'type': 'EMAIL', 'action': 'BLOCK'}],
    },
    blockedInputMessaging='Blocked.',
    blockedOutputsMessaging='Blocked.',
)

# your app flows
rt = boto3.client('bedrock-runtime', endpoint_url='http://localhost:4566',
                  aws_access_key_id='test', aws_secret_access_key='test',
                  region_name='us-east-1')
rt.apply_guardrail(guardrailIdentifier=g['guardrailId'],
                   guardrailVersion='DRAFT', source='INPUT',
                   content=[{'text': {'text': 'contact me at bob@example.com'}}])
```

ApplyGuardrail runs real content evaluation. PII action BLOCK returns the blocked message. Test your guardrail policy without AWS.
Comparison
| Tool | Wire protocol | Deterministic | Guardrails | Fine-tuning jobs | Async batch | Price |
|---|---|---|---|---|---|---|
| fakecloud | Real | Yes | Yes | Yes (control plane) | Yes | Free, AGPL-3.0 |
| Real Bedrock | Real | No | Yes | Yes | Yes | $$$ per token |
| LocalStack Ultimate (Ollama) | Real | No (Ollama inference) | No | No | Partial | Paid Ultimate tier |
| Ollama standalone | Different (OpenAI-ish) | No | No | No | No | Free, but doesn't speak Bedrock |
| Mock library (moto, etc.) | N/A (no HTTP) | Yes | Stubbed | Stubbed | Stubbed | Free |
Links
- Install: `curl -fsSL https://raw.githubusercontent.com/faiscadev/fakecloud/main/install.sh | bash`
- Repo: github.com/faiscadev/fakecloud
- Deep dive: How to test Bedrock code locally, for free, deterministically
- LLM Guardrails: Testing LLM guardrails locally
- Related: Fake AWS server for tests, Test Lambda locally