Overview
Most AI programs eventually touch documents that were never meant for a model context: clinical notes, matter files, HR exports, customer correspondence. De-ID Studio exists for the gap between "files in a bucket" and "safe to embed." It is a batch processor, not an API gateway - you run a task, it finishes, it exits with a clear status code.
Typical buyers include platform engineers wiring document prep into MLOps, compliance leads who want evidence without another SaaS data processor, and legal or healthcare-adjacent teams preparing corpora for retrieval-augmented generation. The tool supports your compliance program; it does not certify HIPAA, GDPR, or Safe Harbor outcomes on your behalf. You decide whether content is PHI or PII and whether output meets your policy.
A single run lists input objects (local path or S3 prefix), detects identifier spans, resolves overlaps, transforms text per category strategy, and writes outputs beside an audit record. Audit entries report counts and strategies - never the original matched strings. Pseudonymization, when enabled, stores an encrypted mapping file at a separate path you configure; it is not co-located with de-identified documents or the audit log.
Formats in v1: plain text, CSV (cell-aware), and PDF with an extractable text layer. Scanned PDFs without text fail by default; a fallback mode can emit text-only output. DOCX is not supported in this release.
Deployment posture: run inside your VPC. Core matching is fully offline. Network use is limited to optional S3 reads and writes you configure. Templates for Docker Compose, ECS Fargate RunTask, and EKS Job ship with the product. Environment variables control input/output sinks, category filters, strategy maps, concurrency, and audit destination.
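As a rough illustration of the environment-variable configuration described above, the sketch below collects variables that appear in the usage instructions on this page into a file that `docker run --env-file` can read. The `write_deid_env` helper and the `DEID_ENV_FILE` location are illustrative assumptions, not part of the product, and combining a category filter with a strategy map in one run is assumed to be allowed. The variable names for concurrency are not listed on this page, so they are left out.

```shell
# Sketch: gather De-ID Studio settings into an env file (helper name is hypothetical).
# Only variables that appear elsewhere on this page are used.
write_deid_env() {
  env_file="$1"
  cat > "$env_file" <<'EOF'
INPUT_SOURCE=local
INPUT_PATH=/input
OUTPUT_SINK=local
OUTPUT_PATH=/output
AUDIT_LOG_PATH=/output/audit.json
IDENTIFIER_CATEGORIES=emails
STRATEGY_CONFIG={"emails":"mask"}
EOF
}

# Assumed location for the file; override DEID_ENV_FILE to change it.
DEID_ENV_FILE="${DEID_ENV_FILE:-${TMPDIR:-/tmp}/deid-studio.env}"
write_deid_env "$DEID_ENV_FILE"

# Then, with the image pulled and the test directories created:
# docker run --rm --env-file "$DEID_ENV_FILE" \
#   -v "$TEST_DIR/input:/input:ro" -v "$TEST_DIR/output:/output" "$IMAGE_URI"
```

Docker passes env-file values through literally, so the JSON in STRATEGY_CONFIG needs no extra shell quoting in the file.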
Pricing: pay per page processed on demand, or choose a monthly plan with included page volume and lower overage rates. Failed documents (for example, unreadable PDFs) are not billed.
From BluePeak: we build privacy-forward batch tooling for regulated document workflows. Documentation, configuration reference, and deployment guides live at loopwell.net/docs.
Highlights
- Process sensitive documents without sending them to a third-party de-identification API. Matching runs on bundled rules and name dictionaries inside your container - suitable for air-gapped or strict data-residency environments.
- Choose redact, mask, or pseudonymize independently for each identifier category. Pseudonym mappings are AES-encrypted and written to a path you control, separate from output files and audit logs.
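To make the difference between strategies concrete, here is a minimal sketch using plain regular expressions. It is an illustration only: De-ID Studio's matching uses bundled rules and dictionaries, and the `mask_ssn` and `redact_email` helpers are hypothetical names. Pseudonymize is omitted because it needs a mapping store.

```shell
# Illustration only: regex stand-ins for two strategies, not the product's detection engine.

# "mask" keeps part of the value visible; here, the last four digits of an SSN-shaped number.
mask_ssn() {
  printf '%s\n' "$1" | sed -E 's/[0-9]{3}-[0-9]{2}-([0-9]{4})/***-**-\1/g'
}

# "redact" replaces the whole span with a category token.
redact_email() {
  printf '%s\n' "$1" | sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL]/g'
}

mask_ssn 'SSN: 123-45-6789'                     # prints: SSN: ***-**-6789
redact_email 'Email: jane.sample@example.com'   # prints: Email: [EMAIL]
```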
Details
Pricing
| Dimension | Description | Cost/month |
|---|---|---|
| De-ID Studio Standard | Monthly contract for production document de-identification pipelines. Includes 10,000 pages per month. All seven identifier categories, redact/mask/pseudonymize per category, local and S3 input/output, count-only audit logging, and ECS/EKS deployment templates. Priority email support during business hours. Annual billing available (10% discount). | $299.00 |
| De-ID Studio Enterprise | Monthly contract for high-volume regulated environments. Includes 50,000 pages per month. All Standard capabilities plus dedicated support channel, custom SLA options, security questionnaire support, and multi-region deployment guidance. For volume beyond 50,000 pages per month, contact BluePeak for a private offer. | $3,500.00 |
Vendor refund policy
BluePeak LLC will review refund requests for first-time De-ID Studio subscriptions made through AWS Marketplace if submitted within fourteen (14) calendar days of the initial charge. Contact contact@loopwell.net with your AWS account ID, product code, and purchase date. Renewals and private offers are excluded. Refunds are issued per AWS Marketplace refund procedures after approval.
Delivery details
Batch de-identification job container
- Amazon ECS
- Amazon EKS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Patch release with detection fixes and no breaking configuration changes.
Fixes in 1.0.1
- Street addresses now match common suffixes, including Terrace, Circle, Parkway, and Highway (long forms are matched before abbreviations).
- Title-based name detection no longer treats lowercase verbs after "Dr." as surnames (e.g. "Dr. Smith reviewed chart" stays intact).
Unchanged from 1.0.0
- Batch ingest from a local directory or S3; output to a local path or S3
- Inputs: .txt, .csv, text-layer .pdf
- Seven identifier categories (names, addresses, phones, emails, government IDs, account numbers, individual-linked dates)
- Strategies per category: redact, mask, pseudonymize (encrypted mapping file)
- Exit codes: 0 success, 1 partial failure, 2 fatal config error
- Offline detection; optional S3 I/O only
- Deployment templates: docker-compose, ECS Fargate task definition, EKS Job
Known limitations
- PDF text layer only; scanned PDFs fail unless PDF_FALLBACK=text
- DOCX not supported
- Rule/dictionary detection may miss or over-match; validate on your own samples
Additional details
Usage instructions
De-ID Studio is a batch job (not a web service). Subscribe, pull the image, run once, inspect /output files and audit.json. Exit 0 = all files OK; 1 = some files failed; 2 = config error.
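Because the job reports its result through the exit code, a wrapper script can branch on it. The following is a minimal sketch; `describe_deid_exit` is a hypothetical helper name, and the messages are paraphrases of the exit-code meanings listed on this page.

```shell
# Sketch: turn De-ID Studio's documented exit codes into a readable status.
# describe_deid_exit is a hypothetical helper name.
describe_deid_exit() {
  case "$1" in
    0) echo "success: all files processed" ;;
    1) echo "partial failure: some files failed, inspect the audit output" ;;
    2) echo "fatal config error: fix the environment variables and rerun" ;;
    # docker run itself uses 125-127 when the container could not be started
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage after a run (the elided arguments are the ones shown in the steps below):
# docker run --rm ... "$IMAGE_URI"; status=$?
# describe_deid_exit "$status"
# [ "$status" -eq 0 ] || exit "$status"
```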
IMAGE AND PATHS
export AWS_REGION=us-east-1
export IMAGE_URI=709825985650.dkr.ecr.us-east-1.amazonaws.com/blue-peak/de-id:1.0.1
export TEST_DIR=$HOME/deid-studio-test
mkdir -p $TEST_DIR/input $TEST_DIR/output
STEP 1 - LOGIN AND PULL
aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com
docker pull $IMAGE_URI
STEP 2 - CREATE SAMPLE INPUT FILES
cat > $TEST_DIR/input/note.txt <<'EOF'
Patient: Jane Sample
DOB: 03/15/1985
SSN: 123-45-6789
Email: jane.sample@example.com
Phone: (555) 234-5678
Address: 742 Evergreen Terrace, Springfield, IL 62704
Notes: Dr. Smith reviewed chart. Account ending in 4111111111111111 on file.
EOF
cat > $TEST_DIR/input/records.csv <<'EOF'
name,email,phone,notes
Jane Sample,jane.sample@example.com,555-234-5678,Patient visit
John Example,john.example@example.com,555-987-6543,Follow-up
EOF
STEP 3 - TEST DEFAULT REDACT (all categories)
rm -rf $TEST_DIR/output/*
docker run --rm \
  -v $TEST_DIR/input:/input:ro \
  -v $TEST_DIR/output:/output \
  -e INPUT_SOURCE=local -e INPUT_PATH=/input \
  -e OUTPUT_SINK=local -e OUTPUT_PATH=/output \
  -e AUDIT_LOG_PATH=/output/audit.json \
  $IMAGE_URI
echo "exit code (expect 0): $?"
VERIFY REDACT (grep -F matches the bracketed tokens literally):
grep -F '[NAME]' $TEST_DIR/output/note.txt
grep -F '[ADDRESS]' $TEST_DIR/output/note.txt
grep -F 'Dr. Smith reviewed chart' $TEST_DIR/output/note.txt
grep -F '[EMAIL]' $TEST_DIR/output/records.csv
cat $TEST_DIR/output/audit.json
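The checks above confirm that replacement tokens appear. A complementary check is that the original values are gone. A minimal sketch, assuming the sample input from STEP 2; `check_no_leaks` is a hypothetical helper, not something shipped with the image.

```shell
# Sketch: fail if any known sample identifier survives in an output file.
# check_no_leaks is a hypothetical helper; pass the file, then the strings to look for.
check_no_leaks() {
  file="$1"
  shift
  leaks=0
  for needle in "$@"; do
    # -F treats the needle as a literal string; -q suppresses normal output
    if grep -F -q -- "$needle" "$file"; then
      echo "LEAK: '$needle' still present in $file"
      leaks=$((leaks + 1))
    fi
  done
  # Return success only when nothing was found
  [ "$leaks" -eq 0 ]
}

# Usage after STEP 3:
# check_no_leaks "$TEST_DIR/output/note.txt" \
#   'Jane Sample' '123-45-6789' 'jane.sample@example.com' '742 Evergreen Terrace'
```

The function returns non-zero when any string is found, so it can gate a CI step the same way the container's exit code does.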
STEP 4 - TEST EMAIL-ONLY FILTER
rm -rf $TEST_DIR/output/*
docker run --rm \
  -v $TEST_DIR/input:/input:ro \
  -v $TEST_DIR/output:/output \
  -e INPUT_SOURCE=local -e INPUT_PATH=/input \
  -e OUTPUT_SINK=local -e OUTPUT_PATH=/output \
  -e AUDIT_LOG_PATH=/output/audit.json \
  -e IDENTIFIER_CATEGORIES=emails \
  $IMAGE_URI
grep -F '[EMAIL]' $TEST_DIR/output/note.txt
grep -F 'Jane Sample' $TEST_DIR/output/note.txt
STEP 5 - TEST MASK STRATEGY
echo 'SSN: 123-45-6789' > $TEST_DIR/input/mask.txt
rm -rf $TEST_DIR/output/*
docker run --rm \
  -v $TEST_DIR/input:/input:ro \
  -v $TEST_DIR/output:/output \
  -e INPUT_SOURCE=local -e INPUT_PATH=/input \
  -e OUTPUT_SINK=local -e OUTPUT_PATH=/output \
  -e AUDIT_LOG_PATH=/output/audit.json \
  -e 'STRATEGY_CONFIG={"gov_ids":"mask","emails":"mask"}' \
  $IMAGE_URI
grep -F '***-**-6789' $TEST_DIR/output/mask.txt
STEP 6 - TEST PSEUDONYMIZE
echo 'Patient Jane Sample' > $TEST_DIR/input/pseudo.txt
export MAPPING_KEY=$(openssl rand -hex 32)
rm -rf $TEST_DIR/output/*
docker run --rm \
  -v $TEST_DIR/input:/input:ro \
  -v $TEST_DIR/output:/output \
  -e INPUT_SOURCE=local -e INPUT_PATH=/input \
  -e OUTPUT_SINK=local -e OUTPUT_PATH=/output \
  -e AUDIT_LOG_PATH=/output/audit.json \
  -e 'STRATEGY_CONFIG={"names":"pseudonymize"}' \
  -e MAPPING_OUTPUT_PATH=/output/mapping.bin \
  -e MAPPING_ENCRYPTION_KEY=$MAPPING_KEY \
  $IMAGE_URI
ls -l $TEST_DIR/output/mapping.bin
cat $TEST_DIR/output/pseudo.txt
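The `openssl rand -hex 32` command used in this step emits 32 random bytes as 64 lowercase hex characters. A small guard can catch an empty or truncated key before the container starts. This is a sketch under an assumption: this page does not state which key formats the product accepts, so the check only confirms the key looks like that command's output. `is_hex_key` is a hypothetical helper.

```shell
# Sketch: confirm the mapping key looks like `openssl rand -hex 32` output
# (exactly 64 lowercase hex characters). is_hex_key is a hypothetical helper.
is_hex_key() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{64}$'
}

# Usage before the docker run in this step:
# is_hex_key "$MAPPING_KEY" || { echo "MAPPING_KEY is not 64 hex characters" >&2; exit 2; }
```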
STEP 7 - TEST CONFIG ERROR (expect exit 2)
docker run --rm $IMAGE_URI; echo "exit code (expect 2): $?"
STEP 8 - ECS / EKS
Use the fulfillment templates deploy/templates/ecs-run-task.json and deploy/templates/eks-job.yaml. Set the image to $IMAGE_URI, configure your S3 input and output buckets, and grant the task role s3:GetObject on the input prefix and s3:PutObject on the output prefix. Then call RunTask (ECS) or apply the Job (EKS).
DOCS: https://loopwell.net/docs/configuration
SUPPORT: contact@loopwell.net
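The task-role permissions named above could be expressed as an IAM policy along these lines. This is a sketch: `render_deid_s3_policy` is a hypothetical helper, and the bucket and prefix names are placeholders to replace with your own.

```shell
# Sketch: print an S3 policy limited to the two actions named in this step.
# render_deid_s3_policy and its bucket/prefix arguments are hypothetical placeholders.
render_deid_s3_policy() {
  in_bucket="$1"; in_prefix="$2"; out_bucket="$3"; out_prefix="$4"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${in_bucket}/${in_prefix}/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::${out_bucket}/${out_prefix}/*"
    }
  ]
}
EOF
}

render_deid_s3_policy example-input-bucket raw example-output-bucket deid
```

Listing the objects under an S3 prefix generally also requires s3:ListBucket on the bucket itself. This page does not say whether the job needs it, so check the deployment guide before relying on this policy as written.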
Support
Vendor support
BluePeak LLC supports De-ID Studio subscribers by email at contact@loopwell.net.
Documentation: https://loopwell.net/docs/quickstart (quickstart), https://loopwell.net/docs/deployment (ECS/EKS/local), https://loopwell.net/docs/configuration (environment variables), https://loopwell.net/docs/compliance (customer responsibilities and limitations).
Billing: subscriptions and usage charges flow through AWS Marketplace only. BluePeak does not sell this product via direct invoice, wire transfer, or card checkout outside Marketplace.
We help with: first-run configuration, S3 IAM policies for input/output buckets, strategy JSON setup, interpreting audit output, and upgrading container tags between versions.
Response expectation: business-day email reply for Standard and Enterprise plans; best-effort within three business days for pay-as-you-go.
Out of scope: legal opinions on regulatory status, custom OCR for scanned PDFs, on-site professional services (available under separate engagement), and guarantees that output meets your specific compliance framework.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
